Final Project¶
A. Introduction¶
a.It is recommended that you complete the introduction after you have finished all the other parts.
Ok.
b. State whether you solved your project as a regression or classification problem.
Regression problem.
c.The general proposal comments including instructions for how to convert a regression problem to a classification problem if that is what you are interested in working on.
Ok.
d.Describe the major findings you have identified based on your analysis.
The linear regression models using all Continuous inputs with linear additive features would be the best model for the spotify songs data set in 2004.
e.Which inputs/features seem to influence the response/outcome the most?
Model 2: energy and loudness.
f.What supported your conclusion? Was it only through predictive models?
The cross validation results supported my conclusion, not only through the predictive models.
g.Could EDA help identify similar trends/relationships?
Yes, the trends (negative or positive) via EDA were consistent with that from params of model 2.
h.Was clustering consistent with any conclusions from the predictive models?
I am not so sure but the distribution of energy-track_popularity group by hierarchy clusters are similar with that of loudness-track_popularity. This is consistent with the linear additive features energy and loudness.
i.What skills did you learn from going through this project?
The concepts of
1.Exploratory data analysis with fancy visualization methods.
2.Two main areas in unsupervised learning(data discovery): clustering, to find similar subgroups within the data and dimensional reduction, to create a few new variables to represent/explain original data.
3.Part of performing supervised learning(predictive analytics): Fitting and predicting models, performance metrics, cross-validations.
j.This is not related to application or project inputs/outputs directly. What general skills can you take away from the project to apply to applications more specific to your area of interest?
The visualization skills could be the most direct taken-away to my work major in clinical pharmaceutical studies. Besides, The logistic regression for Binary classification could also be utilized with clinical data.
B. EDA¶
a.Basic Information¶
Import modules¶
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Read data¶
data_url = 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv'
df = pd.read_csv(data_url)
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 32833 entries, 0 to 32832 Data columns (total 23 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 track_id 32833 non-null object 1 track_name 32828 non-null object 2 track_artist 32828 non-null object 3 track_popularity 32833 non-null int64 4 track_album_id 32833 non-null object 5 track_album_name 32828 non-null object 6 track_album_release_date 32833 non-null object 7 playlist_name 32833 non-null object 8 playlist_id 32833 non-null object 9 playlist_genre 32833 non-null object 10 playlist_subgenre 32833 non-null object 11 danceability 32833 non-null float64 12 energy 32833 non-null float64 13 key 32833 non-null int64 14 loudness 32833 non-null float64 15 mode 32833 non-null int64 16 speechiness 32833 non-null float64 17 acousticness 32833 non-null float64 18 instrumentalness 32833 non-null float64 19 liveness 32833 non-null float64 20 valence 32833 non-null float64 21 tempo 32833 non-null float64 22 duration_ms 32833 non-null int64 dtypes: float64(9), int64(4), object(10) memory usage: 5.8+ MB
1.Show the number of rows and columns
df.shape
(32833, 23)
2.The variable names
3.The data types
df.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 32833 entries, 0 to 32832 Data columns (total 23 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 track_id 32833 non-null object 1 track_name 32828 non-null object 2 track_artist 32828 non-null object 3 track_popularity 32833 non-null int64 4 track_album_id 32833 non-null object 5 track_album_name 32828 non-null object 6 track_album_release_date 32833 non-null object 7 playlist_name 32833 non-null object 8 playlist_id 32833 non-null object 9 playlist_genre 32833 non-null object 10 playlist_subgenre 32833 non-null object 11 danceability 32833 non-null float64 12 energy 32833 non-null float64 13 key 32833 non-null int64 14 loudness 32833 non-null float64 15 mode 32833 non-null int64 16 speechiness 32833 non-null float64 17 acousticness 32833 non-null float64 18 instrumentalness 32833 non-null float64 19 liveness 32833 non-null float64 20 valence 32833 non-null float64 21 tempo 32833 non-null float64 22 duration_ms 32833 non-null int64 dtypes: float64(9), int64(4), object(10) memory usage: 5.8+ MB
4.Number of missing values per variable
df.isna().sum()
track_id 0 track_name 5 track_artist 5 track_popularity 0 track_album_id 0 track_album_name 5 track_album_release_date 0 playlist_name 0 playlist_id 0 playlist_genre 0 playlist_subgenre 0 danceability 0 energy 0 key 0 loudness 0 mode 0 speechiness 0 acousticness 0 instrumentalness 0 liveness 0 valence 0 tempo 0 duration_ms 0 dtype: int64
5.Number of unique values per variable
df.nunique()
track_id 28356 track_name 23449 track_artist 10692 track_popularity 101 track_album_id 22545 track_album_name 19743 track_album_release_date 4530 playlist_name 449 playlist_id 471 playlist_genre 6 playlist_subgenre 24 danceability 822 energy 952 key 12 loudness 10222 mode 2 speechiness 1270 acousticness 3731 instrumentalness 4729 liveness 1624 valence 1362 tempo 17684 duration_ms 19785 dtype: int64
Modify data¶
# modify 1.drop missing as shown in B.b.iii
df1 = df.dropna()
The impact of dropping missing data has been explored in EDA.
# modify 2.remove duplicate as stated in CMPINF 2100 - Final project feedback
df2 = df1.groupby(['track_id','track_album_id','playlist_id'],as_index=False).first()
df2.shape == df1.shape
False
# modify 3.transfer DATE variable as stated in CMPINF 2100 - Final project feedback
df2['track_album_release_date_dt'] = pd.to_datetime(df2.track_album_release_date,format='%Y-%m-%d', errors='coerce')
df2['release_year'] = df2.track_album_release_date_dt.dt.year
df2.columns
Index(['track_id', 'track_album_id', 'playlist_id', 'track_name',
'track_artist', 'track_popularity', 'track_album_name',
'track_album_release_date', 'playlist_name', 'playlist_genre',
'playlist_subgenre', 'danceability', 'energy', 'key', 'loudness',
'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness',
'valence', 'tempo', 'duration_ms', 'track_album_release_date_dt',
'release_year'],
dtype='object')
# modify 4.transfer track_popularity in a regression problem as stated in CMPINF 2100 - Final project feedback
df2['track_pop_shift'] = np.where(df2.track_popularity == 100, df2.track_popularity - 0.1,df2.track_popularity)
df2['track_pop_shift'] = np.where(df2.track_popularity == 0, df2.track_popularity + 0.1,df2.track_popularity)
df2['track_pop_frac']=df2.track_pop_shift / 100
df2['y'] =np.log(df2.track_pop_frac / (1 - df2.track_pop_frac))
df2.columns
Index(['track_id', 'track_album_id', 'playlist_id', 'track_name',
'track_artist', 'track_popularity', 'track_album_name',
'track_album_release_date', 'playlist_name', 'playlist_genre',
'playlist_subgenre', 'danceability', 'energy', 'key', 'loudness',
'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness',
'valence', 'tempo', 'duration_ms', 'track_album_release_date_dt',
'release_year', 'track_pop_shift', 'track_pop_frac', 'y'],
dtype='object')
# modify 5.split and choose part of the dataframe that I was interested in.
df3 = df2.loc[df2.release_year == 2004, :]
df3.info()
<class 'pandas.core.frame.DataFrame'> Index: 328 entries, 148 to 32181 Data columns (total 28 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 track_id 328 non-null object 1 track_album_id 328 non-null object 2 playlist_id 328 non-null object 3 track_name 328 non-null object 4 track_artist 328 non-null object 5 track_popularity 328 non-null int64 6 track_album_name 328 non-null object 7 track_album_release_date 328 non-null object 8 playlist_name 328 non-null object 9 playlist_genre 328 non-null object 10 playlist_subgenre 328 non-null object 11 danceability 328 non-null float64 12 energy 328 non-null float64 13 key 328 non-null int64 14 loudness 328 non-null float64 15 mode 328 non-null int64 16 speechiness 328 non-null float64 17 acousticness 328 non-null float64 18 instrumentalness 328 non-null float64 19 liveness 328 non-null float64 20 valence 328 non-null float64 21 tempo 328 non-null float64 22 duration_ms 328 non-null int64 23 track_album_release_date_dt 328 non-null datetime64[ns] 24 release_year 328 non-null float64 25 track_pop_shift 328 non-null float64 26 track_pop_frac 328 non-null float64 27 y 328 non-null float64 dtypes: datetime64[ns](1), float64(13), int64(4), object(10) memory usage: 74.3+ KB
df3.isna().sum()
track_id 0 track_album_id 0 playlist_id 0 track_name 0 track_artist 0 track_popularity 0 track_album_name 0 track_album_release_date 0 playlist_name 0 playlist_genre 0 playlist_subgenre 0 danceability 0 energy 0 key 0 loudness 0 mode 0 speechiness 0 acousticness 0 instrumentalness 0 liveness 0 valence 0 tempo 0 duration_ms 0 track_album_release_date_dt 0 release_year 0 track_pop_shift 0 track_pop_frac 0 y 0 dtype: int64
df3.nunique()
track_id 300 track_album_id 186 playlist_id 119 track_name 285 track_artist 157 track_popularity 70 track_album_name 166 track_album_release_date 77 playlist_name 116 playlist_genre 6 playlist_subgenre 20 danceability 238 energy 241 key 12 loudness 291 mode 2 speechiness 244 acousticness 268 instrumentalness 151 liveness 231 valence 244 tempo 293 duration_ms 288 track_album_release_date_dt 77 release_year 1 track_pop_shift 70 track_pop_frac 70 y 70 dtype: int64
df3.key.value_counts()
key 1 42 11 35 7 31 6 30 9 27 4 27 2 27 0 26 5 25 10 25 8 23 3 10 Name: count, dtype: int64
df3['mode'].value_counts()
mode 1 175 0 153 Name: count, dtype: int64
df3.playlist_genre.value_counts()
playlist_genre rap 85 r&b 84 rock 75 latin 63 pop 18 edm 3 Name: count, dtype: int64
df3.loudness
148 -8.753
853 -5.705
1045 -10.624
1046 -10.624
1083 -5.151
...
32114 -4.852
32134 -7.185
32143 -4.301
32160 -7.478
32181 -7.161
Name: loudness, Length: 328, dtype: float64
b.Visualization¶
1.Counts of categorical variables
#key
sns.catplot(data=df3,x='key',kind='count' )
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#mode
sns.catplot(data=df3,x='mode',kind='count' )
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#playlist_genre
sns.catplot(data=df3,x='playlist_genre',kind='count')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
2.Distributions of continuous variables
#track_popularity
fig, ax=plt.subplots()
sns.histplot(data=df3, x='track_popularity',bins=20,ax=ax)
plt.show()
#y
sns.displot(data=df3,x='y',bins=20,kind='hist',kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#energy
sns.displot(data=df3, x='energy', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#loudness
sns.displot(data=df3, x='loudness', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#speechiness
sns.displot(data=df3, x='speechiness', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#acousticness
sns.displot(data=df3, x='acousticness', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#instrumentalness
sns.displot(data=df3, x='instrumentalness', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#liveness
sns.displot(data=df3, x='liveness', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#valence
sns.displot(data=df3, x='valence', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#tempo
sns.displot(data=df3, x='tempo', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#duration_ms
sns.displot(data=df3, x='duration_ms', bins=20,kind='hist', kde=True)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
3.Relationships between continuous variables
df3_viz = df3.drop(columns=['key','mode','release_year','track_pop_shift','track_pop_frac','track_popularity'])
# 1.study relationship between all continuous variables
fig, ax=plt.subplots()
sns.heatmap(data=df3_viz.corr(numeric_only=True),
vmin=-1,vmax=1,center=0,
cmap='coolwarm',
annot=True, annot_kws={'size':6},
ax=ax)
plt.show()
sns.lmplot(data= df3_viz, x='energy', y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.lmplot(data= df3_viz, x='loudness', y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.lmplot(data= df3_viz, x='danceability', y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.lmplot(data= df3_viz, x='tempo', y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.lmplot(data= df3_viz, x='valence', y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
4.Summaries of the continuous variables grouped by categorical variables
the_groups = df3.playlist_genre.unique()
corr_per_group = df3.loc[:,['y','tempo','playlist_genre']].groupby(['playlist_genre']).corr()
fig, axs=plt.subplots(1,len(the_groups),figsize=(18,4),sharex=True,sharey=True)
for ix in range(len(the_groups)):
sns.heatmap(data=corr_per_group.loc[the_groups[ix]],
vmin=-1,vmax=1,center=0,
cmap='coolwarm',cbar=False,
annot=True,annot_kws={'size':10},
ax=axs[ix])
axs[ix].set_title('playlist_genre: %s' % the_groups[ix])
plt.show()
the_groups = df3.playlist_genre.unique()
corr_per_group = df3.loc[:,['y','danceability','playlist_genre']].groupby(['playlist_genre']).corr()
fig, axs=plt.subplots(1,len(the_groups),figsize=(18,4),sharex=True,sharey=True)
for ix in range(len(the_groups)):
sns.heatmap(data=corr_per_group.loc[the_groups[ix]],
vmin=-1,vmax=1,center=0,
cmap='coolwarm',cbar=False,
annot=True,annot_kws={'size':10},
ax=axs[ix])
axs[ix].set_title('playlist_genre: %s' % the_groups[ix])
plt.show()
the_groups = df3.playlist_genre.unique()
corr_per_group = df3.loc[:,['y','valence','playlist_genre']].groupby(['playlist_genre']).corr()
fig, axs=plt.subplots(1,len(the_groups),figsize=(18,4),sharex=True,sharey=True)
for ix in range(len(the_groups)):
sns.heatmap(data=corr_per_group.loc[the_groups[ix]],
vmin=-1,vmax=1,center=0,
cmap='coolwarm',cbar=False,
annot=True,annot_kws={'size':10},
ax=axs[ix])
axs[ix].set_title('playlist_genre: %s' % the_groups[ix])
plt.show()
5.Consider visualizing relationships between continuous inputs using scatter plots, pair plots, joint density plots, and correlation plots.
#whole correlation plot
fig, ax=plt.subplots()
sns.heatmap(data=df3_viz.corr(numeric_only=True),
vmin=-1,vmax=1,center=0,
cmap='coolwarm',
annot=True, annot_kws={'size':6},
ax=ax)
plt.show()
#enery-loudness:pair plot
sns.pairplot(data=df3_viz.loc[:,['energy','loudness']], diag_kws={'common_norm': False})
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
##enery-loudness:joint density plot
sns.kdeplot(data=df3_viz, x='energy', y='loudness', fill=True, cmap='Blues')
plt.show()
#enery-loudness:scatter plot
sns.relplot(data=df3_viz,x='energy',y='loudness')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
Continuous variables energy and loudness are positively correlated.
#valence-daceability:scatter plot 0.49
sns.relplot(data=df3_viz,x='valence',y='danceability')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#valence-danceability: pair plot
sns.pairplot(data=df3_viz.loc[:,['valence','danceability']], diag_kws={'common_norm': False})
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#valence-danceability:joint density plot
sns.kdeplot(data=df3_viz, x='valence', y='danceability', fill=True, cmap='Blues')
plt.show()
Continuous variables valence and danceability are positively correlated.
#acousticness-energy: scatter plots -0.54
sns.relplot(data=df3_viz,x='acousticness',y='energy')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#acousticness-energy: pair plot
sns.pairplot(data=df3_viz.loc[:,['acousticness','energy']], diag_kws={'common_norm': False})
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
#acousticness-energy:joint density plot
sns.kdeplot(data=df3_viz, x='acousticness', y='energy', fill=True, cmap='Blues')
plt.show()
Continuous variables acousticness and energy are negatively correlated.
6.If you are working on a regression problem: Visualize scatter plots between the continuous response and the continuous inputs. Summarize the response with boxplots for the unique values of the categorical inputs. Consider using residual plots to assess the fit of your regression model and identify any patterns in residuals.
#Visualize scatter plots between the continuous response and the continuous inputs.
sns.relplot(data=df3,x='valence',y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
The main popular valence was focus on around 0.4 - 1.0.
sns.relplot(data=df3,x='tempo',y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
The main popular tempo was around 80 to 120.
sns.relplot(data=df3,x='danceability',y='y')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
Generally, the higher danceability, the more popular.
sns.catplot(data=df3, x='playlist_genre', y='y',kind='box',
showmeans=True,
meanprops={'marker':'o',
'markerfacecolor':'white',
'markeredgecolor':'black'})
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
Different categories of playlist_genre cause different popularity.
sns.catplot(data=df3, x='mode', y='y',kind='box',
showmeans=True,
meanprops={'marker':'o',
'markerfacecolor':'white',
'markeredgecolor':'black'})
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
The popularity of the two modes are similar.
sns.catplot(data=df3, x='key', y='y',kind='box',
showmeans=True,
meanprops={'marker':'o',
'markerfacecolor':'white',
'markeredgecolor':'black'})
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
Different keys have different popularity.
C. Clustering¶
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
df3_copy = df3.copy()
df3_copy['key'] = df3.key.astype('object')
df3_copy['mode'] = df3.key.astype('object')
df3_copy.info()
<class 'pandas.core.frame.DataFrame'> Index: 328 entries, 148 to 32181 Data columns (total 28 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 track_id 328 non-null object 1 track_album_id 328 non-null object 2 playlist_id 328 non-null object 3 track_name 328 non-null object 4 track_artist 328 non-null object 5 track_popularity 328 non-null int64 6 track_album_name 328 non-null object 7 track_album_release_date 328 non-null object 8 playlist_name 328 non-null object 9 playlist_genre 328 non-null object 10 playlist_subgenre 328 non-null object 11 danceability 328 non-null float64 12 energy 328 non-null float64 13 key 328 non-null object 14 loudness 328 non-null float64 15 mode 328 non-null object 16 speechiness 328 non-null float64 17 acousticness 328 non-null float64 18 instrumentalness 328 non-null float64 19 liveness 328 non-null float64 20 valence 328 non-null float64 21 tempo 328 non-null float64 22 duration_ms 328 non-null int64 23 track_album_release_date_dt 328 non-null datetime64[ns] 24 release_year 328 non-null float64 25 track_pop_shift 328 non-null float64 26 track_pop_frac 328 non-null float64 27 y 328 non-null float64 dtypes: datetime64[ns](1), float64(13), int64(2), object(12) memory usage: 82.4+ KB
df4 = df3_copy.drop(columns = ['track_pop_shift','track_pop_frac','track_album_release_date_dt','release_year'])
df4.info()
<class 'pandas.core.frame.DataFrame'> Index: 328 entries, 148 to 32181 Data columns (total 24 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 track_id 328 non-null object 1 track_album_id 328 non-null object 2 playlist_id 328 non-null object 3 track_name 328 non-null object 4 track_artist 328 non-null object 5 track_popularity 328 non-null int64 6 track_album_name 328 non-null object 7 track_album_release_date 328 non-null object 8 playlist_name 328 non-null object 9 playlist_genre 328 non-null object 10 playlist_subgenre 328 non-null object 11 danceability 328 non-null float64 12 energy 328 non-null float64 13 key 328 non-null object 14 loudness 328 non-null float64 15 mode 328 non-null object 16 speechiness 328 non-null float64 17 acousticness 328 non-null float64 18 instrumentalness 328 non-null float64 19 liveness 328 non-null float64 20 valence 328 non-null float64 21 tempo 328 non-null float64 22 duration_ms 328 non-null int64 23 y 328 non-null float64 dtypes: float64(10), int64(2), object(12) memory usage: 72.2+ KB
df4_features = df4.drop(columns=['y','track_popularity']).select_dtypes('number').copy()
df4_features.info()
<class 'pandas.core.frame.DataFrame'> Index: 328 entries, 148 to 32181 Data columns (total 10 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 danceability 328 non-null float64 1 energy 328 non-null float64 2 loudness 328 non-null float64 3 speechiness 328 non-null float64 4 acousticness 328 non-null float64 5 instrumentalness 328 non-null float64 6 liveness 328 non-null float64 7 valence 328 non-null float64 8 tempo 328 non-null float64 9 duration_ms 328 non-null int64 dtypes: float64(9), int64(1) memory usage: 36.3 KB
Judge whether the continuous variables have vastly different scales¶
sns.catplot(data=df4_features, kind='box', aspect=2)
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
The continuous variables have vastly different scales, standardize the variables before clustering is needed.
Standardize continuous variables¶
Xdf4 = StandardScaler().fit_transform(df4_features)
xdf4 = pd.DataFrame(Xdf4, columns = df4_features.columns)
xdf4
| danceability | energy | loudness | speechiness | acousticness | instrumentalness | liveness | valence | tempo | duration_ms | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.228518 | -1.013955 | -0.767625 | -0.943733 | 1.303515 | -0.210842 | 2.187045 | -0.936229 | 0.028455 | 0.059377 |
| 1 | -0.453146 | 0.101941 | 0.358667 | -0.675642 | -0.611601 | -0.210842 | -0.679626 | -2.274030 | -0.933172 | 1.500060 |
| 2 | -0.703672 | -0.244371 | -1.458995 | 0.768784 | -0.788545 | 0.190331 | -0.309733 | -0.338908 | 1.453963 | -0.317121 |
| 3 | -0.703672 | -0.244371 | -1.458995 | 0.768784 | -0.788545 | 0.190331 | -0.309733 | -0.338908 | 1.453963 | -0.317121 |
| 4 | -0.849326 | 0.827549 | 0.563381 | 1.533530 | 0.529784 | -0.210842 | -0.053654 | 0.821178 | 0.976431 | 0.724371 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 323 | -1.519337 | 1.234329 | 0.673867 | -0.704857 | -0.812472 | -0.180955 | -0.466227 | -0.723958 | -0.779263 | -0.416850 |
| 324 | 1.114098 | 0.266852 | -0.188220 | -0.159224 | -0.694844 | 4.214700 | -0.082107 | -1.642153 | -0.719894 | -0.723362 |
| 325 | -0.820195 | 0.613165 | 0.877472 | -0.709154 | -0.503279 | -0.210842 | -0.437773 | 0.776749 | -0.974514 | -0.612667 |
| 326 | -2.218480 | -0.189401 | -0.296489 | -0.878429 | -0.798043 | -0.170035 | -0.614895 | -1.686582 | 0.251595 | 0.692511 |
| 327 | 0.717917 | -1.200854 | -0.179352 | -0.270928 | 0.860621 | -0.210842 | -0.395093 | 0.411445 | 0.094812 | -0.328870 |
328 rows × 10 columns
i.If you have a mix of continuous and categorical inputs in your data set, cluster the data based on the continuous inputs alone. After identifying the optimal number of clusters, compare the cluster assignments to unique values of several of the categorical inputs.
Hierarchy cluster¶
vi.If your EDA revealed that the continuous variables have vastly different scales, you must standardize the variables before clustering. If your EDA revealed that the continuous inputs are highly correlated, consider clustering using the original variables.
The EDA revealed that the continuous inputs (energy-loudness, acousticness-energy) are highly correlated, consider clustering using the original variables.
## hierarchy cluster
from scipy.cluster import hierarchy
hclust_ward = hierarchy.ward(Xdf4)
fig = plt.figure(figsize=(12,6))
dn = hierarchy.dendrogram(hclust_ward, no_labels=True)
plt.show()
# n=5
np.unique(hierarchy.cut_tree(hclust_ward, n_clusters = 5).ravel())
array([0, 1, 2, 3, 4])
xdf4['hclust_5'] = pd.Series(hierarchy.cut_tree(hclust_ward, n_clusters=5).ravel(), index=xdf4.index).astype('category')
xdf4.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 328 entries, 0 to 327 Data columns (total 11 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 danceability 328 non-null float64 1 energy 328 non-null float64 2 loudness 328 non-null float64 3 speechiness 328 non-null float64 4 acousticness 328 non-null float64 5 instrumentalness 328 non-null float64 6 liveness 328 non-null float64 7 valence 328 non-null float64 8 tempo 328 non-null float64 9 duration_ms 328 non-null float64 10 hclust_5 328 non-null category dtypes: category(1), float64(10) memory usage: 26.2 KB
xdf4.hclust_5
0 0
1 1
2 2
3 2
4 2
..
323 1
324 4
325 3
326 1
327 0
Name: hclust_5, Length: 328, dtype: category
Categories (5, int32): [0, 1, 2, 3, 4]
df4_copy = df4.copy()
df4_copy['hclust_5']= pd.Series(hierarchy.cut_tree(hclust_ward, n_clusters=5).ravel(), index=df4_copy.index).astype('category')
df4_copy.hclust_5.value_counts()
hclust_5 3 122 0 91 1 63 2 43 4 9 Name: count, dtype: int64
Visualization¶
i.If you have a mix of continuous and categorical inputs in your data set, cluster the data based on the continuous inputs alone. After identifying the optimal number of clusters, compare the cluster assignments to unique values of several of the categorical inputs.
fig, ax=plt.subplots()
sns.heatmap(data=pd.crosstab(df4_copy.playlist_genre, df4_copy.hclust_5,margins=True),
annot=True, annot_kws={'fontsize':12},fmt='g',
cbar=False,ax=ax)
plt.show()
ii.Summarize the continuous inputs associated with each of the cluster assignments.
sns.relplot(data=df4_copy,x='energy',y='y',hue='hclust_5')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.relplot(data=df4_copy,x='loudness',y='y',hue='hclust_5')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.relplot(data=df4_copy,x='acousticness',y='y',hue='hclust_5')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=df4_copy, x='hclust_5',y='playlist_genre',kind='box')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=df4_copy, x='hclust_5',y='key',kind='box')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
D.Models: Fitting and Interpretation¶
Standardize¶
df5 = df4.copy().reset_index()
df5.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 328 entries, 0 to 327 Data columns (total 25 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 index 328 non-null int64 1 track_id 328 non-null object 2 track_album_id 328 non-null object 3 playlist_id 328 non-null object 4 track_name 328 non-null object 5 track_artist 328 non-null object 6 track_popularity 328 non-null int64 7 track_album_name 328 non-null object 8 track_album_release_date 328 non-null object 9 playlist_name 328 non-null object 10 playlist_genre 328 non-null object 11 playlist_subgenre 328 non-null object 12 danceability 328 non-null float64 13 energy 328 non-null float64 14 key 328 non-null object 15 loudness 328 non-null float64 16 mode 328 non-null object 17 speechiness 328 non-null float64 18 acousticness 328 non-null float64 19 instrumentalness 328 non-null float64 20 liveness 328 non-null float64 21 valence 328 non-null float64 22 tempo 328 non-null float64 23 duration_ms 328 non-null int64 24 y 328 non-null float64 dtypes: float64(10), int64(3), object(12) memory usage: 64.2+ KB
from sklearn.preprocessing import StandardScaler
df5_co = df5.drop(columns = ['y','track_popularity','index']).select_dtypes('number').copy()
df5_ca = df5.loc[:,['key','mode','playlist_genre']]
df5_y = df5.loc[:,['y']]
df5_co
| danceability | energy | loudness | speechiness | acousticness | instrumentalness | liveness | valence | tempo | duration_ms | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.712 | 0.511 | -8.753 | 0.0297 | 0.397000 | 0.00000 | 0.4830 | 0.412 | 117.896 | 251760 |
| 1 | 0.595 | 0.714 | -5.705 | 0.0609 | 0.038100 | 0.00000 | 0.0800 | 0.141 | 88.449 | 325333 |
| 2 | 0.552 | 0.651 | -10.624 | 0.2290 | 0.004940 | 0.03490 | 0.1320 | 0.533 | 161.548 | 232533 |
| 3 | 0.552 | 0.651 | -10.624 | 0.2290 | 0.004940 | 0.03490 | 0.1320 | 0.533 | 161.548 | 232533 |
| 4 | 0.527 | 0.846 | -5.151 | 0.3180 | 0.252000 | 0.00000 | 0.1680 | 0.768 | 146.925 | 285720 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 323 | 0.412 | 0.920 | -4.852 | 0.0575 | 0.000456 | 0.00260 | 0.1100 | 0.455 | 93.162 | 227440 |
| 324 | 0.864 | 0.744 | -7.185 | 0.1210 | 0.022500 | 0.38500 | 0.1640 | 0.269 | 94.980 | 211787 |
| 325 | 0.532 | 0.807 | -4.301 | 0.0570 | 0.058400 | 0.00000 | 0.1140 | 0.759 | 87.183 | 217440 |
| 326 | 0.292 | 0.661 | -7.478 | 0.0373 | 0.003160 | 0.00355 | 0.0891 | 0.260 | 124.729 | 284093 |
| 327 | 0.796 | 0.477 | -7.161 | 0.1080 | 0.314000 | 0.00000 | 0.1200 | 0.685 | 119.928 | 231933 |
328 rows × 10 columns
Xdf5_co = StandardScaler().fit_transform(df5_co)
xdf5_co = pd.DataFrame(Xdf5_co,columns=df5_co.columns)
df5_new = pd.concat([xdf5_co, df5_ca, df5_y], ignore_index=False,axis=1)
df5_new
| danceability | energy | loudness | speechiness | acousticness | instrumentalness | liveness | valence | tempo | duration_ms | key | mode | playlist_genre | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.228518 | -1.013955 | -0.767625 | -0.943733 | 1.303515 | -0.210842 | 2.187045 | -0.936229 | 0.028455 | 0.059377 | 9 | 9 | r&b | -0.200671 |
| 1 | -0.453146 | 0.101941 | 0.358667 | -0.675642 | -0.611601 | -0.210842 | -0.679626 | -2.274030 | -0.933172 | 1.500060 | 1 | 1 | rock | 0.281851 |
| 2 | -0.703672 | -0.244371 | -1.458995 | 0.768784 | -0.788545 | 0.190331 | -0.309733 | -0.338908 | 1.453963 | -0.317121 | 4 | 4 | rock | 0.200671 |
| 3 | -0.703672 | -0.244371 | -1.458995 | 0.768784 | -0.788545 | 0.190331 | -0.309733 | -0.338908 | 1.453963 | -0.317121 | 4 | 4 | rock | 0.200671 |
| 4 | -0.849326 | 0.827549 | 0.563381 | 1.533530 | 0.529784 | -0.210842 | -0.053654 | 0.821178 | 0.976431 | 0.724371 | 4 | 4 | rap | -0.847298 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 323 | -1.519337 | 1.234329 | 0.673867 | -0.704857 | -0.812472 | -0.180955 | -0.466227 | -0.723958 | -0.779263 | -0.416850 | 9 | 9 | rock | 0.800119 |
| 324 | 1.114098 | 0.266852 | -0.188220 | -0.159224 | -0.694844 | 4.214700 | -0.082107 | -1.642153 | -0.719894 | -0.723362 | 2 | 2 | rap | -0.160343 |
| 325 | -0.820195 | 0.613165 | 0.877472 | -0.709154 | -0.503279 | -0.210842 | -0.437773 | 0.776749 | -0.974514 | -0.612667 | 2 | 2 | pop | 0.160343 |
| 326 | -2.218480 | -0.189401 | -0.296489 | -0.878429 | -0.798043 | -0.170035 | -0.614895 | -1.686582 | 0.251595 | 0.692511 | 11 | 11 | rock | -3.891820 |
| 327 | 0.717917 | -1.200854 | -0.179352 | -0.270928 | 0.860621 | -0.210842 | -0.395093 | 0.411445 | 0.094812 | -0.328870 | 1 | 1 | r&b | 0.663294 |
328 rows × 14 columns
Fit using statsmodels¶
import statsmodels.formula.api as smf
df5_new.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 328 entries, 0 to 327 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 danceability 328 non-null float64 1 energy 328 non-null float64 2 loudness 328 non-null float64 3 speechiness 328 non-null float64 4 acousticness 328 non-null float64 5 instrumentalness 328 non-null float64 6 liveness 328 non-null float64 7 valence 328 non-null float64 8 tempo 328 non-null float64 9 duration_ms 328 non-null float64 10 key 328 non-null object 11 mode 328 non-null object 12 playlist_genre 328 non-null object 13 y 328 non-null float64 dtypes: float64(11), object(3) memory usage: 36.0+ KB
df5_new.rename(columns={'key': 'xa1', 'mode': 'xa2', 'playlist_genre': 'xa3'}, inplace=True)
df5_new.rename(columns={'danceability': 'xo1', 'energy': 'xo2', 'loudness': 'xo3', 'speechiness':'xo4','acousticness':'xo5','instrumentalness':'xo6','liveness':'xo7','valence':'xo8','tempo':'xo9','duration_ms':'xo10'}, inplace=True)
df5_new.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 328 entries, 0 to 327 Data columns (total 14 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 xo1 328 non-null float64 1 xo2 328 non-null float64 2 xo3 328 non-null float64 3 xo4 328 non-null float64 4 xo5 328 non-null float64 5 xo6 328 non-null float64 6 xo7 328 non-null float64 7 xo8 328 non-null float64 8 xo9 328 non-null float64 9 xo10 328 non-null float64 10 xa1 328 non-null object 11 xa2 328 non-null object 12 xa3 328 non-null object 13 y 328 non-null float64 dtypes: float64(11), object(3) memory usage: 36.0+ KB
multiple models:
6 must consider; 2 additional with all variables.
formula_list = ['y ~ 1',
'y ~ xa1 + xa2 + xa3',
'y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10',
'y ~ xa1 + xa2 + xa3 + xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10',
'y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10)** 2',
'y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10) * (xa1 + xa2 + xa3) ',
'y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10 + np.power(xo1,2) + np.power(xo2,2) + np.power(xo3,2) + np.power(xo4,2) + np.power(xo5,2) + np.power(xo6,2) + np.power(xo7,2) + np.power(xo8,2) + np.power(xo9,2) + np.power(xo10,2))',
'y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10 + np.power(xo1,2) + np.power(xo2,2) + np.power(xo3,2) + np.power(xo4,2) + np.power(xo5,2) + np.power(xo6,2) + np.power(xo7,2) + np.power(xo8,2) + np.power(xo9,2) + np.power(xo10,2))']
formula_list[7]
'y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10 + np.power(xo1,2) + np.power(xo2,2) + np.power(xo3,2) + np.power(xo4,2) + np.power(xo5,2) + np.power(xo6,2) + np.power(xo7,2) + np.power(xo8,2) + np.power(xo9,2) + np.power(xo10,2))'
For each model that you fit you must answer the following questions associated with the regression coefficients:
i.How many coefficients were estimated?
ii.How many coefficients (and thus features) are STATISTICALLY SIGNIFICANT using commonly accepted thresholds?
iii.WHICH coefficients (and thus features) are STATISTICALLY SIGNIFICANT and what are the coefficients POSITIVE or NEGATIVE for those features?
iv.Which two STATISTICALLY SIGNIFICANT coefficients (and thus features) have the highest MAGNITUDE coefficient values?
fit_00 = smf.ols(formula=formula_list[0], data=df5_new).fit()
fit_00.params
Intercept -1.369664 dtype: float64
fit_00.bse
Intercept 0.13931 dtype: float64
fit_00.pvalues < 0.05
Intercept True dtype: bool
print(fit_00.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.000
Model: OLS Adj. R-squared: 0.000
Method: Least Squares F-statistic: nan
Date: Thu, 12 Dec 2024 Prob (F-statistic): nan
Time: 20:30:45 Log-Likelihood: -768.46
No. Observations: 328 AIC: 1539.
Df Residuals: 327 BIC: 1543.
Df Model: 0
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -1.3697 0.139 -9.832 0.000 -1.644 -1.096
==============================================================================
Omnibus: 57.029 Durbin-Watson: 1.899
Prob(Omnibus): 0.000 Jarque-Bera (JB): 84.151
Skew: -1.236 Prob(JB): 5.33e-19
Kurtosis: 3.223 Cond. No. 1.00
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
fit_00:
i:only intercept was estimated.
ii:1, the intercept.
iii:intercept(negative).
iv: not available.
fit_01 = smf.ols(formula=formula_list[1], data=df5_new).fit()
print(fit_01.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.063
Model: OLS Adj. R-squared: 0.015
Method: Least Squares F-statistic: 1.315
Date: Thu, 12 Dec 2024 Prob (F-statistic): 0.186
Time: 20:30:49 Log-Likelihood: -757.73
No. Observations: 328 AIC: 1549.
Df Residuals: 311 BIC: 1614.
Df Model: 16
Covariance Type: nonrobust
================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------
Intercept -0.3099 1.553 -0.200 0.842 -3.365 2.745
xa1[T.1] 0.0823 0.317 0.260 0.795 -0.542 0.706
xa1[T.2] 0.0924 0.347 0.266 0.790 -0.590 0.775
xa1[T.3] 0.5619 0.470 1.195 0.233 -0.364 1.487
xa1[T.4] 0.2636 0.346 0.762 0.447 -0.417 0.944
xa1[T.5] 0.0549 0.357 0.154 0.878 -0.648 0.758
xa1[T.6] -0.1591 0.337 -0.472 0.637 -0.822 0.504
xa1[T.7] 0.2903 0.340 0.853 0.394 -0.379 0.960
xa1[T.8] -0.0565 0.363 -0.156 0.876 -0.770 0.657
xa1[T.9] 0.1643 0.345 0.476 0.634 -0.515 0.843
xa1[T.10] -0.1004 0.358 -0.281 0.779 -0.804 0.604
xa1[T.11] -0.1446 0.329 -0.439 0.661 -0.792 0.503
xa2[T.1] 0.0823 0.317 0.260 0.795 -0.542 0.706
xa2[T.2] 0.0924 0.347 0.266 0.790 -0.590 0.775
xa2[T.3] 0.5619 0.470 1.195 0.233 -0.364 1.487
xa2[T.4] 0.2636 0.346 0.762 0.447 -0.417 0.944
xa2[T.5] 0.0549 0.357 0.154 0.878 -0.648 0.758
xa2[T.6] -0.1591 0.337 -0.472 0.637 -0.822 0.504
xa2[T.7] 0.2903 0.340 0.853 0.394 -0.379 0.960
xa2[T.8] -0.0565 0.363 -0.156 0.876 -0.770 0.657
xa2[T.9] 0.1643 0.345 0.476 0.634 -0.515 0.843
xa2[T.10] -0.1004 0.358 -0.281 0.779 -0.804 0.604
xa2[T.11] -0.1446 0.329 -0.439 0.661 -0.792 0.503
xa3[T.latin] -1.7443 1.499 -1.163 0.246 -4.694 1.206
xa3[T.pop] 0.0933 1.586 0.059 0.953 -3.027 3.213
xa3[T.r&b] -1.2727 1.487 -0.856 0.393 -4.199 1.654
xa3[T.rap] -1.5538 1.489 -1.043 0.298 -4.484 1.377
xa3[T.rock] -0.5348 1.494 -0.358 0.721 -3.474 2.405
==============================================================================
Omnibus: 49.139 Durbin-Watson: 1.859
Prob(Omnibus): 0.000 Jarque-Bera (JB): 68.777
Skew: -1.120 Prob(JB): 1.16e-15
Kurtosis: 3.124 Cond. No. 1.18e+16
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 3.27e-30. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
len(fit_01.params)
28
fit_01.pvalues < 0.05
Intercept False xa1[T.1] False xa1[T.2] False xa1[T.3] False xa1[T.4] False xa1[T.5] False xa1[T.6] False xa1[T.7] False xa1[T.8] False xa1[T.9] False xa1[T.10] False xa1[T.11] False xa2[T.1] False xa2[T.2] False xa2[T.3] False xa2[T.4] False xa2[T.5] False xa2[T.6] False xa2[T.7] False xa2[T.8] False xa2[T.9] False xa2[T.10] False xa2[T.11] False xa3[T.latin] False xa3[T.pop] False xa3[T.r&b] False xa3[T.rap] False xa3[T.rock] False dtype: bool
fit_01:
i: 28.
ii:none.
iii:none.
iv:not avaiable(NA).
fit_02 = smf.ols(formula=formula_list[2], data=df5_new).fit()
print(fit_02.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.096
Model: OLS Adj. R-squared: 0.067
Method: Least Squares F-statistic: 3.355
Date: Thu, 12 Dec 2024 Prob (F-statistic): 0.000359
Time: 20:30:52 Log-Likelihood: -751.96
No. Observations: 328 AIC: 1526.
Df Residuals: 317 BIC: 1568.
Df Model: 10
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -1.3697 0.135 -10.180 0.000 -1.634 -1.105
xo1 -0.2546 0.175 -1.455 0.147 -0.599 0.090
xo2 -0.5528 0.242 -2.282 0.023 -1.029 -0.076
xo3 0.5278 0.211 2.506 0.013 0.113 0.942
xo4 -0.2942 0.145 -2.029 0.043 -0.580 -0.009
xo5 -0.0864 0.165 -0.525 0.600 -0.410 0.237
xo6 -0.0909 0.142 -0.640 0.522 -0.370 0.188
xo7 -0.1685 0.139 -1.208 0.228 -0.443 0.106
xo8 -0.0739 0.171 -0.432 0.666 -0.410 0.262
xo9 0.3926 0.147 2.678 0.008 0.104 0.681
xo10 -0.3433 0.140 -2.455 0.015 -0.618 -0.068
==============================================================================
Omnibus: 52.797 Durbin-Watson: 1.888
Prob(Omnibus): 0.000 Jarque-Bera (JB): 75.011
Skew: -1.162 Prob(JB): 5.15e-17
Kurtosis: 3.292 Cond. No. 3.54
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
len(fit_02.params)
11
fit_02.params
Intercept -1.369664 xo1 -0.254649 xo2 -0.552810 xo3 0.527836 xo4 -0.294229 xo5 -0.086426 xo6 -0.090922 xo7 -0.168461 xo8 -0.073912 xo9 0.392640 xo10 -0.343292 dtype: float64
fit_02.pvalues < 0.05
Intercept True xo1 False xo2 True xo3 True xo4 True xo5 False xo6 False xo7 False xo8 False xo9 True xo10 True dtype: bool
fit_02:
i:11.
ii:6 coefficients:intercept, xo2, xo3, xo4, xo9, xo10.
iii:as follows:
Intercept:-1.369664(negative).
xo2:-0.552810(negative).
xo3:0.527836(positive).
xo4:-0.294229(negative).
xo9:0.392640(positive).
xo10:-0.343292(negative).
iv:intercept and xo2.
fit_03 = smf.ols(formula=formula_list[3], data=df5_new).fit()
fit_03.params
Intercept 0.307332 xa1[T.1] 0.006082 xa1[T.2] 0.104477 xa1[T.3] 0.323718 xa1[T.4] 0.186294 xa1[T.5] 0.078887 xa1[T.6] -0.013384 xa1[T.7] 0.438080 xa1[T.8] -0.072181 xa1[T.9] 0.084579 xa1[T.10] -0.123314 xa1[T.11] -0.123451 xa2[T.1] 0.006082 xa2[T.2] 0.104477 xa2[T.3] 0.323718 xa2[T.4] 0.186294 xa2[T.5] 0.078887 xa2[T.6] -0.013384 xa2[T.7] 0.438080 xa2[T.8] -0.072181 xa2[T.9] 0.084579 xa2[T.10] -0.123314 xa2[T.11] -0.123451 xa3[T.latin] -2.176195 xa3[T.pop] -0.492876 xa3[T.r&b] -2.009369 xa3[T.rap] -2.174006 xa3[T.rock] -1.199117 xo1 -0.115966 xo2 -0.696567 xo3 0.636179 xo4 -0.098570 xo5 -0.004158 xo6 -0.105540 xo7 -0.146511 xo8 -0.037778 xo9 0.325810 xo10 -0.380575 dtype: float64
len(fit_03.params)
38
fit_03.pvalues < 0.05
Intercept False xa1[T.1] False xa1[T.2] False xa1[T.3] False xa1[T.4] False xa1[T.5] False xa1[T.6] False xa1[T.7] False xa1[T.8] False xa1[T.9] False xa1[T.10] False xa1[T.11] False xa2[T.1] False xa2[T.2] False xa2[T.3] False xa2[T.4] False xa2[T.5] False xa2[T.6] False xa2[T.7] False xa2[T.8] False xa2[T.9] False xa2[T.10] False xa2[T.11] False xa3[T.latin] False xa3[T.pop] False xa3[T.r&b] False xa3[T.rap] False xa3[T.rock] False xo1 False xo2 True xo3 True xo4 False xo5 False xo6 False xo7 False xo8 False xo9 True xo10 True dtype: bool
fit_03:
i:38.
ii:4. xo2, xo3, xo9, xo10.
iii: as follows:
xo2:-0.696567(negative),
xo3:0.636179(positive),
xo9:0.325810(positive),
xo10:-0.380575(negative).
iv:xo2 and xo3.
fit_04 = smf.ols(formula=formula_list[4], data=df5_new).fit()
print(fit_04.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.216
Model: OLS Adj. R-squared: 0.058
Method: Least Squares F-statistic: 1.366
Date: Thu, 12 Dec 2024 Prob (F-statistic): 0.0562
Time: 20:30:59 Log-Likelihood: -728.45
No. Observations: 328 AIC: 1569.
Df Residuals: 272 BIC: 1781.
Df Model: 55
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
Intercept -1.1970 0.275 -4.358 0.000 -1.738 -0.656
xo1 -0.2042 0.204 -1.001 0.317 -0.606 0.197
xo2 -0.4929 0.289 -1.705 0.089 -1.062 0.076
xo3 0.4869 0.295 1.651 0.100 -0.094 1.068
xo4 0.0293 0.248 0.118 0.906 -0.460 0.518
xo5 -0.0506 0.285 -0.178 0.859 -0.611 0.510
xo6 0.1735 0.849 0.204 0.838 -1.498 1.845
xo7 -0.2396 0.214 -1.120 0.264 -0.661 0.181
xo8 -0.2073 0.210 -0.989 0.324 -0.620 0.205
xo9 0.2903 0.210 1.381 0.169 -0.124 0.704
xo10 -0.2690 0.190 -1.417 0.158 -0.643 0.105
xo1:xo2 0.4540 0.326 1.395 0.164 -0.187 1.095
xo1:xo3 -0.0788 0.318 -0.248 0.804 -0.705 0.547
xo1:xo4 -0.0012 0.234 -0.005 0.996 -0.462 0.460
xo1:xo5 -0.0613 0.229 -0.268 0.789 -0.512 0.390
xo1:xo6 -0.2265 0.307 -0.737 0.462 -0.832 0.379
xo1:xo7 -0.1282 0.227 -0.565 0.572 -0.575 0.318
xo1:xo8 -0.1928 0.197 -0.979 0.329 -0.581 0.195
xo1:xo9 0.1599 0.218 0.734 0.463 -0.269 0.589
xo1:xo10 0.2733 0.187 1.463 0.145 -0.095 0.641
xo2:xo3 0.2228 0.208 1.071 0.285 -0.187 0.632
xo2:xo4 -0.5676 0.339 -1.675 0.095 -1.235 0.100
xo2:xo5 0.2094 0.282 0.744 0.458 -0.345 0.764
xo2:xo6 -0.2467 0.596 -0.414 0.679 -1.420 0.927
xo2:xo7 0.0437 0.284 0.154 0.878 -0.515 0.603
xo2:xo8 -0.6461 0.317 -2.041 0.042 -1.269 -0.023
xo2:xo9 0.3442 0.324 1.061 0.290 -0.294 0.983
xo2:xo10 0.4922 0.321 1.532 0.127 -0.140 1.125
xo3:xo4 0.0041 0.319 0.013 0.990 -0.624 0.632
xo3:xo5 0.0967 0.271 0.357 0.721 -0.436 0.630
xo3:xo6 -0.3498 0.951 -0.368 0.713 -2.222 1.523
xo3:xo7 0.0091 0.270 0.034 0.973 -0.523 0.542
xo3:xo8 0.2566 0.315 0.814 0.417 -0.364 0.877
xo3:xo9 -0.1653 0.297 -0.556 0.579 -0.751 0.420
xo3:xo10 -0.3772 0.269 -1.400 0.163 -0.908 0.153
xo4:xo5 0.1131 0.220 0.514 0.608 -0.320 0.546
xo4:xo6 1.8604 1.046 1.778 0.076 -0.199 3.920
xo4:xo7 0.0134 0.194 0.069 0.945 -0.368 0.395
xo4:xo8 0.3668 0.235 1.558 0.120 -0.097 0.830
xo4:xo9 0.1801 0.161 1.118 0.265 -0.137 0.497
xo4:xo10 0.1621 0.220 0.736 0.462 -0.271 0.595
xo5:xo6 -1.6898 1.140 -1.482 0.139 -3.934 0.554
xo5:xo7 0.1995 0.239 0.833 0.405 -0.272 0.671
xo5:xo8 -0.2363 0.277 -0.853 0.394 -0.782 0.309
xo5:xo9 -0.0646 0.217 -0.297 0.767 -0.492 0.363
xo5:xo10 0.3888 0.241 1.614 0.108 -0.085 0.863
xo6:xo7 -0.3704 0.666 -0.556 0.579 -1.682 0.942
xo6:xo8 0.4026 0.329 1.225 0.222 -0.245 1.050
xo6:xo9 0.1443 0.334 0.432 0.666 -0.513 0.802
xo6:xo10 -0.0168 0.328 -0.051 0.959 -0.663 0.630
xo7:xo8 0.2566 0.213 1.205 0.229 -0.163 0.676
xo7:xo9 0.2040 0.187 1.089 0.277 -0.165 0.573
xo7:xo10 0.4077 0.196 2.083 0.038 0.022 0.793
xo8:xo9 -0.4616 0.237 -1.946 0.053 -0.929 0.005
xo8:xo10 -0.2903 0.190 -1.525 0.128 -0.665 0.084
xo9:xo10 -0.0851 0.199 -0.427 0.670 -0.478 0.308
==============================================================================
Omnibus: 47.239 Durbin-Watson: 2.008
Prob(Omnibus): 0.000 Jarque-Bera (JB): 64.380
Skew: -1.076 Prob(JB): 1.05e-14
Kurtosis: 3.277 Cond. No. 36.6
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
len(fit_04.params)
56
fit_04.params
Intercept -1.197021 xo1 -0.204179 xo2 -0.492885 xo3 0.486855 xo4 0.029285 xo5 -0.050610 xo6 0.173452 xo7 -0.239649 xo8 -0.207340 xo9 0.290324 xo10 -0.268953 xo1:xo2 0.453979 xo1:xo3 -0.078793 xo1:xo4 -0.001225 xo1:xo5 -0.061348 xo1:xo6 -0.226485 xo1:xo7 -0.128195 xo1:xo8 -0.192824 xo1:xo9 0.159948 xo1:xo10 0.273344 xo2:xo3 0.222802 xo2:xo4 -0.567616 xo2:xo5 0.209396 xo2:xo6 -0.246665 xo2:xo7 0.043743 xo2:xo8 -0.646115 xo2:xo9 0.344194 xo2:xo10 0.492238 xo3:xo4 0.004065 xo3:xo5 0.096674 xo3:xo6 -0.349785 xo3:xo7 0.009137 xo3:xo8 0.256604 xo3:xo9 -0.165343 xo3:xo10 -0.377246 xo4:xo5 0.113076 xo4:xo6 1.860426 xo4:xo7 0.013426 xo4:xo8 0.366802 xo4:xo9 0.180124 xo4:xo10 0.162056 xo5:xo6 -1.689833 xo5:xo7 0.199484 xo5:xo8 -0.236321 xo5:xo9 -0.064562 xo5:xo10 0.388823 xo6:xo7 -0.370420 xo6:xo8 0.402610 xo6:xo9 0.144307 xo6:xo10 -0.016841 xo7:xo8 0.256605 xo7:xo9 0.203996 xo7:xo10 0.407678 xo8:xo9 -0.461601 xo8:xo10 -0.290268 xo9:xo10 -0.085109 dtype: float64
fit_04.pvalues < 0.05
Intercept True xo1 False xo2 False xo3 False xo4 False xo5 False xo6 False xo7 False xo8 False xo9 False xo10 False xo1:xo2 False xo1:xo3 False xo1:xo4 False xo1:xo5 False xo1:xo6 False xo1:xo7 False xo1:xo8 False xo1:xo9 False xo1:xo10 False xo2:xo3 False xo2:xo4 False xo2:xo5 False xo2:xo6 False xo2:xo7 False xo2:xo8 True xo2:xo9 False xo2:xo10 False xo3:xo4 False xo3:xo5 False xo3:xo6 False xo3:xo7 False xo3:xo8 False xo3:xo9 False xo3:xo10 False xo4:xo5 False xo4:xo6 False xo4:xo7 False xo4:xo8 False xo4:xo9 False xo4:xo10 False xo5:xo6 False xo5:xo7 False xo5:xo8 False xo5:xo9 False xo5:xo10 False xo6:xo7 False xo6:xo8 False xo6:xo9 False xo6:xo10 False xo7:xo8 False xo7:xo9 False xo7:xo10 True xo8:xo9 False xo8:xo10 False xo9:xo10 False dtype: bool
fit_04:
i:56.
ii: 3. intercept, xo2:xo8, xo7:xo10.
iii: as follows:
intercept:-1.197021(negative),
xo2:xo8 -0.646115(negative),
xo7:xo10 0.407678 (negative).
iv: intercept and xo2:xo8.
fit_05 = smf.ols(formula=formula_list[5], data=df5_new).fit()
print(fit_05.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.539
Model: OLS Adj. R-squared: 0.002
Method: Least Squares F-statistic: 1.003
Date: Thu, 12 Dec 2024 Prob (F-statistic): 0.494
Time: 20:31:03 Log-Likelihood: -641.49
No. Observations: 328 AIC: 1637.
Df Residuals: 151 BIC: 2308.
Df Model: 176
Covariance Type: nonrobust
=====================================================================================
coef std err t P>|t| [0.025 0.975]
-------------------------------------------------------------------------------------
Intercept 1.1496 1.420 0.810 0.419 -1.656 3.955
xa1[T.1] -1.6690 3.124 -0.534 0.594 -7.842 4.504
xa1[T.2] -0.5599 0.724 -0.773 0.441 -1.991 0.871
xa1[T.3] -0.0039 1.086 -0.004 0.997 -2.149 2.141
xa1[T.4] 0.8747 0.841 1.040 0.300 -0.787 2.536
xa1[T.5] -0.8346 0.637 -1.309 0.192 -2.094 0.425
xa1[T.6] -0.4763 0.709 -0.671 0.503 -1.878 0.925
xa1[T.7] -0.0072 0.808 -0.009 0.993 -1.604 1.590
xa1[T.8] -0.9162 0.587 -1.562 0.120 -2.075 0.243
xa1[T.9] -0.3921 0.683 -0.574 0.567 -1.742 0.958
xa1[T.10] -0.5561 0.622 -0.894 0.373 -1.785 0.672
xa1[T.11] 0.9377 1.127 0.832 0.407 -1.290 3.165
xa2[T.1] -1.6690 3.124 -0.534 0.594 -7.842 4.504
xa2[T.2] -0.5599 0.724 -0.773 0.441 -1.991 0.871
xa2[T.3] -0.0039 1.086 -0.004 0.997 -2.149 2.141
xa2[T.4] 0.8747 0.841 1.040 0.300 -0.787 2.536
xa2[T.5] -0.8346 0.637 -1.309 0.192 -2.094 0.425
xa2[T.6] -0.4763 0.709 -0.671 0.503 -1.878 0.925
xa2[T.7] -0.0072 0.808 -0.009 0.993 -1.604 1.590
xa2[T.8] -0.9162 0.587 -1.562 0.120 -2.075 0.243
xa2[T.9] -0.3921 0.683 -0.574 0.567 -1.742 0.958
xa2[T.10] -0.5561 0.622 -0.894 0.373 -1.785 0.672
xa2[T.11] 0.9377 1.127 0.832 0.407 -1.290 3.165
xa3[T.latin] -2.5951 1.404 -1.849 0.066 -5.368 0.178
xa3[T.pop] 1.7679 4.601 0.384 0.701 -7.323 10.859
xa3[T.r&b] -4.5357 1.732 -2.619 0.010 -7.957 -1.114
xa3[T.rap] -0.3967 1.412 -0.281 0.779 -3.187 2.393
xa3[T.rock] -1.0172 1.542 -0.659 0.511 -4.065 2.030
xo1 0.1662 1.203 0.138 0.890 -2.210 2.543
xo1:xa1[T.1] 0.6747 0.698 0.966 0.336 -0.705 2.055
xo1:xa1[T.2] 0.3642 0.663 0.549 0.584 -0.946 1.674
xo1:xa1[T.3] 0.6391 0.937 0.682 0.496 -1.212 2.490
xo1:xa1[T.4] 0.5959 0.739 0.807 0.421 -0.864 2.055
xo1:xa1[T.5] -0.5401 1.019 -0.530 0.597 -2.554 1.474
xo1:xa1[T.6] 0.2779 0.859 0.324 0.747 -1.419 1.975
xo1:xa1[T.7] -0.0808 0.773 -0.105 0.917 -1.608 1.446
xo1:xa1[T.8] -0.5415 0.870 -0.622 0.535 -2.261 1.178
xo1:xa1[T.9] 0.1072 0.831 0.129 0.898 -1.535 1.749
xo1:xa1[T.10] 0.5519 0.776 0.711 0.478 -0.982 2.086
xo1:xa1[T.11] 0.6993 0.699 1.001 0.319 -0.681 2.080
xo1:xa2[T.1] 0.6747 0.698 0.966 0.336 -0.705 2.055
xo1:xa2[T.2] 0.3642 0.663 0.549 0.584 -0.946 1.674
xo1:xa2[T.3] 0.6391 0.937 0.682 0.496 -1.212 2.490
xo1:xa2[T.4] 0.5959 0.739 0.807 0.421 -0.864 2.055
xo1:xa2[T.5] -0.5401 1.019 -0.530 0.597 -2.554 1.474
xo1:xa2[T.6] 0.2779 0.859 0.324 0.747 -1.419 1.975
xo1:xa2[T.7] -0.0808 0.773 -0.105 0.917 -1.608 1.446
xo1:xa2[T.8] -0.5415 0.870 -0.622 0.535 -2.261 1.178
xo1:xa2[T.9] 0.1072 0.831 0.129 0.898 -1.535 1.749
xo1:xa2[T.10] 0.5519 0.776 0.711 0.478 -0.982 2.086
xo1:xa2[T.11] 0.6993 0.699 1.001 0.319 -0.681 2.080
xo1:xa3[T.latin] -0.5723 1.042 -0.549 0.584 -2.632 1.487
xo1:xa3[T.pop] -1.2478 1.601 -0.779 0.437 -4.411 1.915
xo1:xa3[T.r&b] -0.0849 0.976 -0.087 0.931 -2.013 1.843
xo1:xa3[T.rap] -0.8640 0.864 -1.000 0.319 -2.571 0.843
xo1:xa3[T.rock] -0.7727 1.029 -0.751 0.454 -2.805 1.259
xo2 -2.4204 1.777 -1.362 0.175 -5.932 1.091
xo2:xa1[T.1] 2.0213 0.939 2.152 0.033 0.166 3.877
xo2:xa1[T.2] 1.2757 0.977 1.305 0.194 -0.656 3.207
xo2:xa1[T.3] -0.2179 3.941 -0.055 0.956 -8.004 7.568
xo2:xa1[T.4] 1.8662 1.010 1.847 0.067 -0.130 3.862
xo2:xa1[T.5] 0.9494 1.053 0.902 0.369 -1.131 3.030
xo2:xa1[T.6] 0.1386 1.169 0.119 0.906 -2.172 2.449
xo2:xa1[T.7] 0.5402 1.013 0.534 0.594 -1.460 2.541
xo2:xa1[T.8] 1.3029 1.152 1.131 0.260 -0.973 3.579
xo2:xa1[T.9] 1.9833 1.083 1.832 0.069 -0.156 4.122
xo2:xa1[T.10] 3.4405 1.312 2.622 0.010 0.848 6.033
xo2:xa1[T.11] 1.4452 1.000 1.446 0.150 -0.530 3.420
xo2:xa2[T.1] 2.0213 0.939 2.152 0.033 0.166 3.877
xo2:xa2[T.2] 1.2757 0.977 1.305 0.194 -0.656 3.207
xo2:xa2[T.3] -0.2179 3.941 -0.055 0.956 -8.004 7.568
xo2:xa2[T.4] 1.8662 1.010 1.847 0.067 -0.130 3.862
xo2:xa2[T.5] 0.9494 1.053 0.902 0.369 -1.131 3.030
xo2:xa2[T.6] 0.1386 1.169 0.119 0.906 -2.172 2.449
xo2:xa2[T.7] 0.5402 1.013 0.534 0.594 -1.460 2.541
xo2:xa2[T.8] 1.3029 1.152 1.131 0.260 -0.973 3.579
xo2:xa2[T.9] 1.9833 1.083 1.832 0.069 -0.156 4.122
xo2:xa2[T.10] 3.4405 1.312 2.622 0.010 0.848 6.033
xo2:xa2[T.11] 1.4452 1.000 1.446 0.150 -0.530 3.420
xo2:xa3[T.latin] -0.6603 1.635 -0.404 0.687 -3.890 2.570
xo2:xa3[T.pop] -2.3525 3.005 -0.783 0.435 -8.290 3.585
xo2:xa3[T.r&b] -1.7764 1.374 -1.293 0.198 -4.490 0.938
xo2:xa3[T.rap] 0.6932 1.349 0.514 0.608 -1.971 3.358
xo2:xa3[T.rock] -0.8747 1.421 -0.616 0.539 -3.683 1.933
xo3 0.4991 1.149 0.434 0.665 -1.772 2.770
xo3:xa1[T.1] -0.7195 0.794 -0.906 0.366 -2.289 0.850
xo3:xa1[T.2] -0.5816 0.747 -0.779 0.437 -2.057 0.894
xo3:xa1[T.3] 0.5736 3.355 0.171 0.864 -6.055 7.203
xo3:xa1[T.4] -1.0264 0.777 -1.321 0.188 -2.561 0.509
xo3:xa1[T.5] -0.0077 1.061 -0.007 0.994 -2.103 2.088
xo3:xa1[T.6] 0.3226 0.947 0.341 0.734 -1.549 2.194
xo3:xa1[T.7] 0.5335 0.885 0.603 0.548 -1.216 2.283
xo3:xa1[T.8] -0.6243 1.132 -0.551 0.582 -2.862 1.613
xo3:xa1[T.9] -1.4522 0.895 -1.623 0.107 -3.220 0.315
xo3:xa1[T.10] -1.2099 1.026 -1.179 0.240 -3.237 0.817
xo3:xa1[T.11] -0.5441 0.963 -0.565 0.573 -2.447 1.359
xo3:xa2[T.1] -0.7195 0.794 -0.906 0.366 -2.289 0.850
xo3:xa2[T.2] -0.5816 0.747 -0.779 0.437 -2.057 0.894
xo3:xa2[T.3] 0.5736 3.355 0.171 0.864 -6.055 7.203
xo3:xa2[T.4] -1.0264 0.777 -1.321 0.188 -2.561 0.509
xo3:xa2[T.5] -0.0077 1.061 -0.007 0.994 -2.103 2.088
xo3:xa2[T.6] 0.3226 0.947 0.341 0.734 -1.549 2.194
xo3:xa2[T.7] 0.5335 0.885 0.603 0.548 -1.216 2.283
xo3:xa2[T.8] -0.6243 1.132 -0.551 0.582 -2.862 1.613
xo3:xa2[T.9] -1.4522 0.895 -1.623 0.107 -3.220 0.315
xo3:xa2[T.10] -1.2099 1.026 -1.179 0.240 -3.237 0.817
xo3:xa2[T.11] -0.5441 0.963 -0.565 0.573 -2.447 1.359
xo3:xa3[T.latin] 1.6684 0.895 1.864 0.064 -0.100 3.437
xo3:xa3[T.pop] 1.4145 1.784 0.793 0.429 -2.111 4.940
xo3:xa3[T.r&b] 0.1468 1.017 0.144 0.885 -1.863 2.157
xo3:xa3[T.rap] -1.2652 0.998 -1.268 0.207 -3.237 0.707
xo3:xa3[T.rock] 1.7717 1.097 1.615 0.108 -0.396 3.939
xo4 1.0250 1.257 0.815 0.416 -1.459 3.509
xo4:xa1[T.1] -0.8881 0.519 -1.710 0.089 -1.914 0.138
xo4:xa1[T.2] -0.4965 0.876 -0.567 0.572 -2.227 1.234
xo4:xa1[T.3] -0.1013 1.029 -0.098 0.922 -2.134 1.932
xo4:xa1[T.4] -0.4308 0.616 -0.699 0.486 -1.649 0.787
xo4:xa1[T.5] -1.2516 0.728 -1.718 0.088 -2.691 0.188
xo4:xa1[T.6] -0.7456 0.565 -1.319 0.189 -1.862 0.371
xo4:xa1[T.7] -0.4874 0.558 -0.873 0.384 -1.590 0.615
xo4:xa1[T.8] -0.3769 0.563 -0.670 0.504 -1.489 0.735
xo4:xa1[T.9] -0.1115 0.666 -0.167 0.867 -1.428 1.205
xo4:xa1[T.10] 1.2723 0.708 1.798 0.074 -0.126 2.671
xo4:xa1[T.11] -0.7596 0.541 -1.404 0.162 -1.828 0.309
xo4:xa2[T.1] -0.8881 0.519 -1.710 0.089 -1.914 0.138
xo4:xa2[T.2] -0.4965 0.876 -0.567 0.572 -2.227 1.234
xo4:xa2[T.3] -0.1013 1.029 -0.098 0.922 -2.134 1.932
xo4:xa2[T.4] -0.4308 0.616 -0.699 0.486 -1.649 0.787
xo4:xa2[T.5] -1.2516 0.728 -1.718 0.088 -2.691 0.188
xo4:xa2[T.6] -0.7456 0.565 -1.319 0.189 -1.862 0.371
xo4:xa2[T.7] -0.4874 0.558 -0.873 0.384 -1.590 0.615
xo4:xa2[T.8] -0.3769 0.563 -0.670 0.504 -1.489 0.735
xo4:xa2[T.9] -0.1115 0.666 -0.167 0.867 -1.428 1.205
xo4:xa2[T.10] 1.2723 0.708 1.798 0.074 -0.126 2.671
xo4:xa2[T.11] -0.7596 0.541 -1.404 0.162 -1.828 0.309
xo4:xa3[T.latin] -1.4793 1.325 -1.117 0.266 -4.097 1.138
xo4:xa3[T.pop] 3.6999 4.819 0.768 0.444 -5.821 13.221
xo4:xa3[T.r&b] 0.0839 1.193 0.070 0.944 -2.273 2.441
xo4:xa3[T.rap] -0.4104 1.159 -0.354 0.724 -2.700 1.879
xo4:xa3[T.rock] -0.8676 1.572 -0.552 0.582 -3.974 2.239
xo5 0.3113 1.477 0.211 0.833 -2.606 3.229
xo5:xa1[T.1] -0.2603 0.816 -0.319 0.750 -1.873 1.353
xo5:xa1[T.2] -0.9027 0.841 -1.073 0.285 -2.565 0.759
xo5:xa1[T.3] -1.0795 1.416 -0.763 0.447 -3.876 1.717
xo5:xa1[T.4] -1.2134 0.926 -1.310 0.192 -3.043 0.617
xo5:xa1[T.5] -1.1406 0.908 -1.256 0.211 -2.935 0.654
xo5:xa1[T.6] -1.5190 0.892 -1.704 0.090 -3.280 0.242
xo5:xa1[T.7] -0.9675 0.857 -1.129 0.261 -2.661 0.726
xo5:xa1[T.8] -1.4796 1.026 -1.442 0.151 -3.507 0.548
xo5:xa1[T.9] -0.8323 0.881 -0.945 0.346 -2.573 0.908
xo5:xa1[T.10] -0.0764 0.893 -0.086 0.932 -1.842 1.689
xo5:xa1[T.11] -0.0721 0.999 -0.072 0.943 -2.046 1.902
xo5:xa2[T.1] -0.2603 0.816 -0.319 0.750 -1.873 1.353
xo5:xa2[T.2] -0.9027 0.841 -1.073 0.285 -2.565 0.759
xo5:xa2[T.3] -1.0795 1.416 -0.763 0.447 -3.876 1.717
xo5:xa2[T.4] -1.2134 0.926 -1.310 0.192 -3.043 0.617
xo5:xa2[T.5] -1.1406 0.908 -1.256 0.211 -2.935 0.654
xo5:xa2[T.6] -1.5190 0.892 -1.704 0.090 -3.280 0.242
xo5:xa2[T.7] -0.9675 0.857 -1.129 0.261 -2.661 0.726
xo5:xa2[T.8] -1.4796 1.026 -1.442 0.151 -3.507 0.548
xo5:xa2[T.9] -0.8323 0.881 -0.945 0.346 -2.573 0.908
xo5:xa2[T.10] -0.0764 0.893 -0.086 0.932 -1.842 1.689
xo5:xa2[T.11] -0.0721 0.999 -0.072 0.943 -2.046 1.902
xo5:xa3[T.latin] 1.3261 0.892 1.487 0.139 -0.436 3.088
xo5:xa3[T.pop] -1.2830 2.041 -0.629 0.531 -5.316 2.750
xo5:xa3[T.r&b] 1.2391 0.832 1.489 0.139 -0.405 2.883
xo5:xa3[T.rap] 1.9848 0.989 2.007 0.047 0.031 3.938
xo5:xa3[T.rock] 2.3109 0.923 2.505 0.013 0.488 4.134
xo6 0.3592 1.401 0.256 0.798 -2.409 3.127
xo6:xa1[T.1] -6.2104 14.562 -0.426 0.670 -34.982 22.561
xo6:xa1[T.2] -0.5870 0.651 -0.902 0.369 -1.873 0.699
xo6:xa1[T.3] -0.0093 0.191 -0.049 0.961 -0.387 0.369
xo6:xa1[T.4] 3.6833 3.296 1.118 0.266 -2.828 10.195
xo6:xa1[T.5] -0.7045 2.518 -0.280 0.780 -5.680 4.271
xo6:xa1[T.6] 0.0755 1.354 0.056 0.956 -2.600 2.751
xo6:xa1[T.7] -0.0893 3.051 -0.029 0.977 -6.118 5.939
xo6:xa1[T.8] -1.2981 0.749 -1.733 0.085 -2.778 0.182
xo6:xa1[T.9] 0.7883 2.229 0.354 0.724 -3.617 5.193
xo6:xa1[T.10] -0.9950 0.697 -1.428 0.155 -2.372 0.382
xo6:xa1[T.11] 7.6821 5.287 1.453 0.148 -2.763 18.127
xo6:xa2[T.1] -6.2104 14.562 -0.426 0.670 -34.982 22.561
xo6:xa2[T.2] -0.5870 0.651 -0.902 0.369 -1.873 0.699
xo6:xa2[T.3] -0.0093 0.191 -0.049 0.961 -0.387 0.369
xo6:xa2[T.4] 3.6833 3.296 1.118 0.266 -2.828 10.195
xo6:xa2[T.5] -0.7045 2.518 -0.280 0.780 -5.680 4.271
xo6:xa2[T.6] 0.0755 1.354 0.056 0.956 -2.600 2.751
xo6:xa2[T.7] -0.0893 3.051 -0.029 0.977 -6.118 5.939
xo6:xa2[T.8] -1.2981 0.749 -1.733 0.085 -2.778 0.182
xo6:xa2[T.9] 0.7883 2.229 0.354 0.724 -3.617 5.193
xo6:xa2[T.10] -0.9950 0.697 -1.428 0.155 -2.372 0.382
xo6:xa2[T.11] 7.6821 5.287 1.453 0.148 -2.763 18.127
xo6:xa3[T.latin] -2.0025 4.397 -0.455 0.649 -10.690 6.685
xo6:xa3[T.pop] 1.6089 2.395 0.672 0.503 -3.124 6.341
xo6:xa3[T.r&b] -11.4814 7.059 -1.627 0.106 -25.428 2.465
xo6:xa3[T.rap] 0.5509 1.163 0.474 0.636 -1.747 2.849
xo6:xa3[T.rock] 0.9683 1.635 0.592 0.555 -2.262 4.198
xo7 2.3350 2.014 1.159 0.248 -1.644 6.314
xo7:xa1[T.1] -1.2628 0.549 -2.301 0.023 -2.347 -0.178
xo7:xa1[T.2] -0.9259 0.948 -0.977 0.330 -2.799 0.947
xo7:xa1[T.3] -1.8114 0.997 -1.816 0.071 -3.782 0.159
xo7:xa1[T.4] -1.4651 0.645 -2.273 0.024 -2.739 -0.192
xo7:xa1[T.5] -1.6588 0.617 -2.690 0.008 -2.877 -0.440
xo7:xa1[T.6] -0.1713 0.677 -0.253 0.801 -1.508 1.166
xo7:xa1[T.7] -0.8753 0.556 -1.573 0.118 -1.975 0.224
xo7:xa1[T.8] -0.2140 0.650 -0.329 0.742 -1.498 1.070
xo7:xa1[T.9] -1.2089 0.548 -2.208 0.029 -2.291 -0.127
xo7:xa1[T.10] -0.9398 0.696 -1.351 0.179 -2.315 0.435
xo7:xa1[T.11] -1.6892 0.725 -2.331 0.021 -3.121 -0.257
xo7:xa2[T.1] -1.2628 0.549 -2.301 0.023 -2.347 -0.178
xo7:xa2[T.2] -0.9259 0.948 -0.977 0.330 -2.799 0.947
xo7:xa2[T.3] -1.8114 0.997 -1.816 0.071 -3.782 0.159
xo7:xa2[T.4] -1.4651 0.645 -2.273 0.024 -2.739 -0.192
xo7:xa2[T.5] -1.6588 0.617 -2.690 0.008 -2.877 -0.440
xo7:xa2[T.6] -0.1713 0.677 -0.253 0.801 -1.508 1.166
xo7:xa2[T.7] -0.8753 0.556 -1.573 0.118 -1.975 0.224
xo7:xa2[T.8] -0.2140 0.650 -0.329 0.742 -1.498 1.070
xo7:xa2[T.9] -1.2089 0.548 -2.208 0.029 -2.291 -0.127
xo7:xa2[T.10] -0.9398 0.696 -1.351 0.179 -2.315 0.435
xo7:xa2[T.11] -1.6892 0.725 -2.331 0.021 -3.121 -0.257
xo7:xa3[T.latin] 0.0415 1.962 0.021 0.983 -3.835 3.918
xo7:xa3[T.pop] -0.0773 2.387 -0.032 0.974 -4.793 4.638
xo7:xa3[T.r&b] -0.3671 1.855 -0.198 0.843 -4.032 3.297
xo7:xa3[T.rap] -0.9593 1.871 -0.513 0.609 -4.656 2.737
xo7:xa3[T.rock] -0.4558 1.940 -0.235 0.815 -4.289 3.377
xo8 -1.6737 0.863 -1.940 0.054 -3.378 0.030
xo8:xa1[T.1] 0.2924 0.564 0.519 0.605 -0.821 1.406
xo8:xa1[T.2] 0.3394 0.707 0.480 0.632 -1.058 1.737
xo8:xa1[T.3] 1.8609 3.452 0.539 0.591 -4.960 8.682
xo8:xa1[T.4] 0.6837 0.697 0.981 0.328 -0.694 2.061
xo8:xa1[T.5] 1.4097 0.704 2.002 0.047 0.018 2.801
xo8:xa1[T.6] 0.5712 0.744 0.768 0.444 -0.898 2.041
xo8:xa1[T.7] 0.3074 0.615 0.500 0.618 -0.908 1.522
xo8:xa1[T.8] 0.3697 0.987 0.375 0.708 -1.580 2.320
xo8:xa1[T.9] 0.4564 0.655 0.697 0.487 -0.838 1.751
xo8:xa1[T.10] 0.2133 0.871 0.245 0.807 -1.508 1.935
xo8:xa1[T.11] -0.0309 0.695 -0.044 0.965 -1.404 1.342
xo8:xa2[T.1] 0.2924 0.564 0.519 0.605 -0.821 1.406
xo8:xa2[T.2] 0.3394 0.707 0.480 0.632 -1.058 1.737
xo8:xa2[T.3] 1.8609 3.452 0.539 0.591 -4.960 8.682
xo8:xa2[T.4] 0.6837 0.697 0.981 0.328 -0.694 2.061
xo8:xa2[T.5] 1.4097 0.704 2.002 0.047 0.018 2.801
xo8:xa2[T.6] 0.5712 0.744 0.768 0.444 -0.898 2.041
xo8:xa2[T.7] 0.3074 0.615 0.500 0.618 -0.908 1.522
xo8:xa2[T.8] 0.3697 0.987 0.375 0.708 -1.580 2.320
xo8:xa2[T.9] 0.4564 0.655 0.697 0.487 -0.838 1.751
xo8:xa2[T.10] 0.2133 0.871 0.245 0.807 -1.508 1.935
xo8:xa2[T.11] -0.0309 0.695 -0.044 0.965 -1.404 1.342
xo8:xa3[T.latin] 0.3712 0.758 0.490 0.625 -1.127 1.869
xo8:xa3[T.pop] 1.5037 1.301 1.155 0.250 -1.068 4.075
xo8:xa3[T.r&b] 0.7912 0.774 1.022 0.308 -0.738 2.321
xo8:xa3[T.rap] -0.5526 0.681 -0.811 0.418 -1.898 0.793
xo8:xa3[T.rock] 0.5961 0.713 0.836 0.404 -0.812 2.005
xo9 0.4098 1.003 0.409 0.683 -1.572 2.392
xo9:xa1[T.1] 0.6136 0.545 1.125 0.262 -0.464 1.691
xo9:xa1[T.2] 0.6009 0.714 0.841 0.401 -0.810 2.012
xo9:xa1[T.3] -0.5054 0.850 -0.595 0.553 -2.184 1.174
xo9:xa1[T.4] -0.3995 0.651 -0.614 0.540 -1.686 0.887
xo9:xa1[T.5] 0.4484 0.622 0.720 0.472 -0.781 1.678
xo9:xa1[T.6] -0.0595 0.652 -0.091 0.927 -1.348 1.229
xo9:xa1[T.7] -0.1943 0.598 -0.325 0.746 -1.375 0.987
xo9:xa1[T.8] 0.9613 0.645 1.490 0.138 -0.313 2.236
xo9:xa1[T.9] 0.2865 0.602 0.476 0.635 -0.902 1.475
xo9:xa1[T.10] -0.9223 0.634 -1.455 0.148 -2.175 0.330
xo9:xa1[T.11] 0.3207 0.576 0.557 0.578 -0.816 1.458
xo9:xa2[T.1] 0.6136 0.545 1.125 0.262 -0.464 1.691
xo9:xa2[T.2] 0.6009 0.714 0.841 0.401 -0.810 2.012
xo9:xa2[T.3] -0.5054 0.850 -0.595 0.553 -2.184 1.174
xo9:xa2[T.4] -0.3995 0.651 -0.614 0.540 -1.686 0.887
xo9:xa2[T.5] 0.4484 0.622 0.720 0.472 -0.781 1.678
xo9:xa2[T.6] -0.0595 0.652 -0.091 0.927 -1.348 1.229
xo9:xa2[T.7] -0.1943 0.598 -0.325 0.746 -1.375 0.987
xo9:xa2[T.8] 0.9613 0.645 1.490 0.138 -0.313 2.236
xo9:xa2[T.9] 0.2865 0.602 0.476 0.635 -0.902 1.475
xo9:xa2[T.10] -0.9223 0.634 -1.455 0.148 -2.175 0.330
xo9:xa2[T.11] 0.3207 0.576 0.557 0.578 -0.816 1.458
xo9:xa3[T.latin] -0.5425 0.953 -0.569 0.570 -2.425 1.340
xo9:xa3[T.pop] -1.1249 1.400 -0.803 0.423 -3.891 1.641
xo9:xa3[T.r&b] -0.3151 0.735 -0.429 0.669 -1.768 1.137
xo9:xa3[T.rap] -0.9778 0.686 -1.426 0.156 -2.333 0.377
xo9:xa3[T.rock] -0.6716 0.823 -0.816 0.416 -2.298 0.955
xo10 -0.9034 1.126 -0.802 0.424 -3.128 1.322
xo10:xa1[T.1] 0.9318 0.623 1.495 0.137 -0.299 2.163
xo10:xa1[T.2] 0.6253 0.637 0.982 0.328 -0.632 1.883
xo10:xa1[T.3] 1.4560 1.084 1.343 0.181 -0.686 3.598
xo10:xa1[T.4] 1.2895 0.718 1.796 0.074 -0.129 2.708
xo10:xa1[T.5] -0.0017 0.725 -0.002 0.998 -1.435 1.431
xo10:xa1[T.6] 0.7694 0.572 1.346 0.180 -0.360 1.899
xo10:xa1[T.7] 0.0174 0.649 0.027 0.979 -1.265 1.300
xo10:xa1[T.8] 0.4307 0.613 0.703 0.483 -0.780 1.642
xo10:xa1[T.9] -0.2275 0.812 -0.280 0.780 -1.832 1.377
xo10:xa1[T.10] 0.4588 0.660 0.696 0.488 -0.844 1.762
xo10:xa1[T.11] 0.5187 0.651 0.797 0.427 -0.768 1.805
xo10:xa2[T.1] 0.9318 0.623 1.495 0.137 -0.299 2.163
xo10:xa2[T.2] 0.6253 0.637 0.982 0.328 -0.632 1.883
xo10:xa2[T.3] 1.4560 1.084 1.343 0.181 -0.686 3.598
xo10:xa2[T.4] 1.2895 0.718 1.796 0.074 -0.129 2.708
xo10:xa2[T.5] -0.0017 0.725 -0.002 0.998 -1.435 1.431
xo10:xa2[T.6] 0.7694 0.572 1.346 0.180 -0.360 1.899
xo10:xa2[T.7] 0.0174 0.649 0.027 0.979 -1.265 1.300
xo10:xa2[T.8] 0.4307 0.613 0.703 0.483 -0.780 1.642
xo10:xa2[T.9] -0.2275 0.812 -0.280 0.780 -1.832 1.377
xo10:xa2[T.10] 0.4588 0.660 0.696 0.488 -0.844 1.762
xo10:xa2[T.11] 0.5187 0.651 0.797 0.427 -0.768 1.805
xo10:xa3[T.latin] -0.5308 0.919 -0.577 0.565 -2.347 1.286
xo10:xa3[T.pop] -0.3842 1.815 -0.212 0.833 -3.970 3.201
xo10:xa3[T.r&b] -0.4796 0.952 -0.504 0.615 -2.360 1.400
xo10:xa3[T.rap] 0.5616 0.804 0.698 0.486 -1.028 2.151
xo10:xa3[T.rock] -0.9874 0.858 -1.151 0.252 -2.683 0.708
==============================================================================
Omnibus: 22.817 Durbin-Watson: 2.054
Prob(Omnibus): 0.000 Jarque-Bera (JB): 26.749
Skew: -0.594 Prob(JB): 1.55e-06
Kurtosis: 3.738 Cond. No. 1.21e+17
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 8.08e-32. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
fit_05.params
Intercept 1.149640
xa1[T.1] -1.668951
xa1[T.2] -0.559869
xa1[T.3] -0.003865
xa1[T.4] 0.874659
...
xo10:xa3[T.latin] -0.530830
xo10:xa3[T.pop] -0.384192
xo10:xa3[T.r&b] -0.479647
xo10:xa3[T.rap] 0.561590
xo10:xa3[T.rock] -0.987449
Length: 308, dtype: float64
len(fit_05.params)
308
fit_05.pvalues < 0.05
Intercept False
xa1[T.1] False
xa1[T.2] False
xa1[T.3] False
xa1[T.4] False
...
xo10:xa3[T.latin] False
xo10:xa3[T.pop] False
xo10:xa3[T.r&b] False
xo10:xa3[T.rap] False
xo10:xa3[T.rock] False
Length: 308, dtype: bool
fit_05:
i: 308.
ii: 18.
xa3[T.r&b],
xo2:xa1[T.1],
xo2:xa1[T.10],
xo2:xa2[T.1],
xo2:xa2[T.10],
xo5:xa3[T.rap],
xo5:xa3[T.rock],
xo7:xa1[T.1],
xo7:xa1[T.4],
xo7:xa1[T.9],
xo7:xa1[T.11],
xo7:xa2[T.1],
xo7:xa2[T.4],
xo7:xa2[T.5],
xo7:xa2[T.9],
xo7:xa2[T.11],
xo8:xa1[T.5],
xo8:xa2[T.5].
iii:
xa3[T.r&b],-4.5357 (negative) xo2:xa1[T.1], 2.021 (positive) 3 xo2:xa1[T.10],3.44 (positive) 05 xo2:xa2[T.1],2.0 (positive) 213 xo2:xa2[T.10],3. (positive) 4405 xo5:xa3[T.rap],1 (positive) .9848 xo5:xa3[T.rock], (positive) 2.3109 xo7:xa1[T.1], (negative) -1.2628 xo7:xa1[T.4],(negative) -1.4651 xo7:xa1[T.9 (negative) ],-1.2089 xo7:xa1[T.1 (negative)1 ],-1.6892 xo7:xa2[T.(negative) 1],-1.2628 xo7:xa2[T(negative) .4],-1.4651 xo7:xa2[T(negative) .5], -1.6588 xo7:xa2[(negative) T.9], -1.2089 xo7:xa2 (negative) [T.11], -1.6892 xo8:(positive) xa1[T.5],1.4097 xo8 (positive):xa2[Txa3[T.r&b], xo2:xa2[T.10]cept and xo2:xo8.
fit_06 = smf.ols(formula=formula_list[6], data=df5_new).fit()
print(fit_06.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.150
Model: OLS Adj. R-squared: 0.045
Method: Least Squares F-statistic: 1.429
Date: Thu, 12 Dec 2024 Prob (F-statistic): 0.0598
Time: 20:31:07 Log-Likelihood: -741.77
No. Observations: 328 AIC: 1558.
Df Residuals: 291 BIC: 1698.
Df Model: 36
Covariance Type: nonrobust
=====================================================================================
coef std err t P>|t| [0.025 0.975]
-------------------------------------------------------------------------------------
Intercept 0.4328 1.619 0.267 0.789 -2.754 3.619
xa1[T.1] 0.0241 0.344 0.070 0.944 -0.652 0.701
xa1[T.2] 0.1001 0.361 0.277 0.782 -0.610 0.810
xa1[T.3] 0.4875 0.501 0.973 0.332 -0.499 1.474
xa1[T.4] 0.1498 0.363 0.412 0.680 -0.565 0.865
xa1[T.5] 0.0240 0.367 0.065 0.948 -0.698 0.746
xa1[T.6] 0.0305 0.358 0.085 0.932 -0.674 0.735
xa1[T.7] 0.4518 0.358 1.262 0.208 -0.253 1.157
xa1[T.8] -0.0742 0.381 -0.195 0.846 -0.825 0.676
xa1[T.9] 0.0765 0.361 0.212 0.832 -0.633 0.786
xa1[T.10] -0.0837 0.375 -0.223 0.824 -0.822 0.654
xa1[T.11] -0.1473 0.349 -0.422 0.674 -0.835 0.541
xa2[T.1] 0.0241 0.344 0.070 0.944 -0.652 0.701
xa2[T.2] 0.1001 0.361 0.277 0.782 -0.610 0.810
xa2[T.3] 0.4875 0.501 0.973 0.332 -0.499 1.474
xa2[T.4] 0.1498 0.363 0.412 0.680 -0.565 0.865
xa2[T.5] 0.0240 0.367 0.065 0.948 -0.698 0.746
xa2[T.6] 0.0305 0.358 0.085 0.932 -0.674 0.735
xa2[T.7] 0.4518 0.358 1.262 0.208 -0.253 1.157
xa2[T.8] -0.0742 0.381 -0.195 0.846 -0.825 0.676
xa2[T.9] 0.0765 0.361 0.212 0.832 -0.633 0.786
xa2[T.10] -0.0837 0.375 -0.223 0.824 -0.822 0.654
xa2[T.11] -0.1473 0.349 -0.422 0.674 -0.835 0.541
xa3[T.latin] -2.3602 1.544 -1.528 0.127 -5.399 0.679
xa3[T.pop] -0.4356 1.637 -0.266 0.790 -3.657 2.786
xa3[T.r&b] -1.9654 1.539 -1.277 0.203 -4.995 1.064
xa3[T.rap] -2.2077 1.552 -1.423 0.156 -5.262 0.846
xa3[T.rock] -1.2367 1.561 -0.792 0.429 -4.310 1.836
xo1 -0.1120 0.240 -0.466 0.641 -0.585 0.361
xo2 -0.6340 0.297 -2.132 0.034 -1.219 -0.049
xo3 0.4142 0.279 1.483 0.139 -0.135 0.964
xo4 -0.0848 0.285 -0.297 0.767 -0.647 0.477
xo5 0.0749 0.272 0.276 0.783 -0.460 0.609
xo6 -0.2345 0.470 -0.499 0.618 -1.159 0.690
xo7 -0.0748 0.235 -0.318 0.751 -0.537 0.388
xo8 -0.0008 0.199 -0.004 0.997 -0.393 0.391
xo9 0.3207 0.198 1.623 0.106 -0.068 0.710
xo10 -0.4433 0.193 -2.293 0.023 -0.824 -0.063
np.power(xo1, 2) 0.1066 0.143 0.748 0.455 -0.174 0.387
np.power(xo2, 2) -0.1728 0.161 -1.070 0.285 -0.491 0.145
np.power(xo3, 2) -0.0752 0.084 -0.899 0.369 -0.240 0.089
np.power(xo4, 2) -0.0146 0.151 -0.097 0.923 -0.312 0.283
np.power(xo5, 2) 0.0204 0.131 0.156 0.876 -0.237 0.278
np.power(xo6, 2) 0.0167 0.070 0.238 0.812 -0.121 0.155
np.power(xo7, 2) -0.0482 0.098 -0.491 0.624 -0.242 0.145
np.power(xo8, 2) 0.0528 0.130 0.406 0.685 -0.203 0.309
np.power(xo9, 2) 0.0165 0.162 0.102 0.919 -0.302 0.335
np.power(xo10, 2) 0.0012 0.086 0.014 0.989 -0.168 0.171
==============================================================================
Omnibus: 45.348 Durbin-Watson: 1.873
Prob(Omnibus): 0.000 Jarque-Bera (JB): 61.390
Skew: -1.056 Prob(JB): 4.67e-14
Kurtosis: 3.177 Cond. No. 1.54e+16
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The smallest eigenvalue is 6.24e-29. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
fit_06.params
Intercept 0.432819 xa1[T.1] 0.024142 xa1[T.2] 0.100098 xa1[T.3] 0.487453 xa1[T.4] 0.149793 xa1[T.5] 0.024035 xa1[T.6] 0.030522 xa1[T.7] 0.451839 xa1[T.8] -0.074178 xa1[T.9] 0.076478 xa1[T.10] -0.083696 xa1[T.11] -0.147333 xa2[T.1] 0.024142 xa2[T.2] 0.100098 xa2[T.3] 0.487453 xa2[T.4] 0.149793 xa2[T.5] 0.024035 xa2[T.6] 0.030522 xa2[T.7] 0.451839 xa2[T.8] -0.074178 xa2[T.9] 0.076478 xa2[T.10] -0.083696 xa2[T.11] -0.147333 xa3[T.latin] -2.360171 xa3[T.pop] -0.435641 xa3[T.r&b] -1.965420 xa3[T.rap] -2.207750 xa3[T.rock] -1.236741 xo1 -0.112018 xo2 -0.634028 xo3 0.414208 xo4 -0.084765 xo5 0.074885 xo6 -0.234542 xo7 -0.074800 xo8 -0.000823 xo9 0.320686 xo10 -0.443323 np.power(xo1, 2) 0.106613 np.power(xo2, 2) -0.172809 np.power(xo3, 2) -0.075242 np.power(xo4, 2) -0.014610 np.power(xo5, 2) 0.020402 np.power(xo6, 2) 0.016719 np.power(xo7, 2) -0.048214 np.power(xo8, 2) 0.052787 np.power(xo9, 2) 0.016493 np.power(xo10, 2) 0.001183 dtype: float64
len(fit_06.params)
48
fit_06.pvalues < 0.05
Intercept False xa1[T.1] False xa1[T.2] False xa1[T.3] False xa1[T.4] False xa1[T.5] False xa1[T.6] False xa1[T.7] False xa1[T.8] False xa1[T.9] False xa1[T.10] False xa1[T.11] False xa2[T.1] False xa2[T.2] False xa2[T.3] False xa2[T.4] False xa2[T.5] False xa2[T.6] False xa2[T.7] False xa2[T.8] False xa2[T.9] False xa2[T.10] False xa2[T.11] False xa3[T.latin] False xa3[T.pop] False xa3[T.r&b] False xa3[T.rap] False xa3[T.rock] False xo1 False xo2 True xo3 False xo4 False xo5 False xo6 False xo7 False xo8 False xo9 False xo10 True np.power(xo1, 2) False np.power(xo2, 2) False np.power(xo3, 2) False np.power(xo4, 2) False np.power(xo5, 2) False np.power(xo6, 2) False np.power(xo7, 2) False np.power(xo8, 2) False np.power(xo9, 2) False np.power(xo10, 2) False dtype: bool
fit_06:
i:48.
ii: 2. xo2, xo10.
iii:
xo2 -0.634028(negative)
xo10 -0.443323 (negative).
iv: xo2, xo10.
fit_07 = smf.ols(formula=formula_list[7], data=df5_new).fit()
print(fit_07.summary())
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.960
Model: OLS Adj. R-squared: 0.378
Method: Least Squares F-statistic: 1.648
Date: Thu, 12 Dec 2024 Prob (F-statistic): 0.0868
Time: 20:31:12 Log-Likelihood: -240.45
No. Observations: 328 AIC: 1095.
Df Residuals: 21 BIC: 2259.
Df Model: 306
Covariance Type: nonrobust
==================================================================================================
coef std err t P>|t| [0.025 0.975]
--------------------------------------------------------------------------------------------------
Intercept -13.6937 12.783 -1.071 0.296 -40.278 12.890
xa1[T.1] 31.5047 15.588 2.021 0.056 -0.913 63.922
xa1[T.2] 7.3413 10.712 0.685 0.501 -14.936 29.618
xa1[T.3] 25.4538 10.885 2.338 0.029 2.818 48.090
xa1[T.4] -22.0020 26.980 -0.816 0.424 -78.109 34.105
xa1[T.5] 28.6348 28.079 1.020 0.319 -29.758 87.028
xa1[T.6] -0.4111 7.827 -0.053 0.959 -16.688 15.866
xa1[T.7] 22.4322 16.159 1.388 0.180 -11.172 56.036
xa1[T.8] -2.4220 24.231 -0.100 0.921 -52.814 47.970
xa1[T.9] -7.9253 13.546 -0.585 0.565 -36.096 20.245
xa1[T.10] 70.0216 30.741 2.278 0.033 6.092 133.951
xa1[T.11] -22.7380 21.140 -1.076 0.294 -66.700 21.224
xa2[T.1] 31.5047 15.588 2.021 0.056 -0.913 63.922
xa2[T.2] 7.3413 10.712 0.685 0.501 -14.936 29.618
xa2[T.3] 25.4538 10.885 2.338 0.029 2.818 48.090
xa2[T.4] -22.0020 26.980 -0.816 0.424 -78.109 34.105
xa2[T.5] 28.6348 28.079 1.020 0.319 -29.758 87.028
xa2[T.6] -0.4111 7.827 -0.053 0.959 -16.688 15.866
xa2[T.7] 22.4322 16.159 1.388 0.180 -11.172 56.036
xa2[T.8] -2.4220 24.231 -0.100 0.921 -52.814 47.970
xa2[T.9] -7.9253 13.546 -0.585 0.565 -36.096 20.245
xa2[T.10] 70.0216 30.741 2.278 0.033 6.092 133.951
xa2[T.11] -22.7380 21.140 -1.076 0.294 -66.700 21.224
xa3[T.latin] -13.8324 11.927 -1.160 0.259 -38.635 10.970
xa3[T.pop] -65.0616 49.542 -1.313 0.203 -168.089 37.966
xa3[T.r&b] 86.2249 43.103 2.000 0.059 -3.413 175.863
xa3[T.rap] -16.7620 16.893 -0.992 0.332 -51.893 18.369
xa3[T.rock] -15.4068 18.186 -0.847 0.406 -53.227 22.414
xo1 1.2489 10.893 0.115 0.910 -21.405 23.903
xa1[T.1]:xo1 -9.6192 6.851 -1.404 0.175 -23.866 4.628
xa1[T.2]:xo1 -9.9304 7.383 -1.345 0.193 -25.285 5.424
xa1[T.3]:xo1 -11.8855 4.478 -2.654 0.015 -21.198 -2.573
xa1[T.4]:xo1 -13.8436 8.271 -1.674 0.109 -31.045 3.358
xa1[T.5]:xo1 60.6248 46.429 1.306 0.206 -35.929 157.179
xa1[T.6]:xo1 12.6096 14.962 0.843 0.409 -18.506 43.725
xa1[T.7]:xo1 -7.8769 6.535 -1.205 0.241 -21.466 5.713
xa1[T.8]:xo1 22.0940 10.110 2.185 0.040 1.070 43.118
xa1[T.9]:xo1 37.8722 23.706 1.598 0.125 -11.427 87.171
xa1[T.10]:xo1 -8.6422 15.717 -0.550 0.588 -41.327 24.043
xa1[T.11]:xo1 -6.7500 7.115 -0.949 0.354 -21.547 8.047
xa2[T.1]:xo1 -9.6192 6.851 -1.404 0.175 -23.866 4.628
xa2[T.2]:xo1 -9.9304 7.383 -1.345 0.193 -25.285 5.424
xa2[T.3]:xo1 -11.8855 4.478 -2.654 0.015 -21.198 -2.573
xa2[T.4]:xo1 -13.8436 8.271 -1.674 0.109 -31.045 3.358
xa2[T.5]:xo1 60.6248 46.429 1.306 0.206 -35.929 157.179
xa2[T.6]:xo1 12.6096 14.962 0.843 0.409 -18.506 43.725
xa2[T.7]:xo1 -7.8769 6.535 -1.205 0.241 -21.466 5.713
xa2[T.8]:xo1 22.0940 10.110 2.185 0.040 1.070 43.118
xa2[T.9]:xo1 37.8722 23.706 1.598 0.125 -11.427 87.171
xa2[T.10]:xo1 -8.6422 15.717 -0.550 0.588 -41.327 24.043
xa2[T.11]:xo1 -6.7500 7.115 -0.949 0.354 -21.547 8.047
xa3[T.latin]:xo1 -18.2205 21.073 -0.865 0.397 -62.044 25.603
xa3[T.pop]:xo1 31.9554 16.539 1.932 0.067 -2.439 66.350
xa3[T.r&b]:xo1 14.8855 8.612 1.728 0.099 -3.024 32.795
xa3[T.rap]:xo1 -0.4766 7.102 -0.067 0.947 -15.247 14.293
xa3[T.rock]:xo1 -34.6541 12.991 -2.668 0.014 -61.670 -7.638
xo2 29.7745 17.773 1.675 0.109 -7.186 66.735
xa1[T.1]:xo2 -24.5563 13.735 -1.788 0.088 -53.120 4.007
xa1[T.2]:xo2 -4.3350 8.549 -0.507 0.617 -22.113 13.443
xa1[T.3]:xo2 -6.4697 3.412 -1.896 0.072 -13.566 0.626
xa1[T.4]:xo2 -16.7048 14.623 -1.142 0.266 -47.114 13.704
xa1[T.5]:xo2 -7.0002 18.748 -0.373 0.713 -45.988 31.988
xa1[T.6]:xo2 -13.6924 18.736 -0.731 0.473 -52.657 25.272
xa1[T.7]:xo2 -24.8694 14.649 -1.698 0.104 -55.333 5.594
xa1[T.8]:xo2 -25.1046 26.474 -0.948 0.354 -80.160 29.951
xa1[T.9]:xo2 11.3154 14.100 0.803 0.431 -18.007 40.638
xa1[T.10]:xo2 36.6143 27.152 1.348 0.192 -19.852 93.080
xa1[T.11]:xo2 -36.4555 14.949 -2.439 0.024 -67.544 -5.367
xa2[T.1]:xo2 -24.5563 13.735 -1.788 0.088 -53.120 4.007
xa2[T.2]:xo2 -4.3350 8.549 -0.507 0.617 -22.113 13.443
xa2[T.3]:xo2 -6.4697 3.412 -1.896 0.072 -13.566 0.626
xa2[T.4]:xo2 -16.7048 14.623 -1.142 0.266 -47.114 13.704
xa2[T.5]:xo2 -7.0002 18.748 -0.373 0.713 -45.988 31.988
xa2[T.6]:xo2 -13.6924 18.736 -0.731 0.473 -52.657 25.272
xa2[T.7]:xo2 -24.8694 14.649 -1.698 0.104 -55.333 5.594
xa2[T.8]:xo2 -25.1046 26.474 -0.948 0.354 -80.160 29.951
xa2[T.9]:xo2 11.3154 14.100 0.803 0.431 -18.007 40.638
xa2[T.10]:xo2 36.6143 27.152 1.348 0.192 -19.852 93.080
xa2[T.11]:xo2 -36.4555 14.949 -2.439 0.024 -67.544 -5.367
xa3[T.latin]:xo2 50.8644 21.232 2.396 0.026 6.709 95.020
xa3[T.pop]:xo2 -53.3531 28.580 -1.867 0.076 -112.788 6.082
xa3[T.r&b]:xo2 17.7864 11.686 1.522 0.143 -6.516 42.089
xa3[T.rap]:xo2 21.6763 9.662 2.243 0.036 1.583 41.770
xa3[T.rock]:xo2 -20.5510 10.581 -1.942 0.066 -42.555 1.453
xo3 -7.5362 19.955 -0.378 0.709 -49.035 33.963
xa1[T.1]:xo3 1.2145 11.048 0.110 0.914 -21.761 24.190
xa1[T.2]:xo3 -6.7166 7.353 -0.913 0.371 -22.008 8.575
xa1[T.3]:xo3 2.7691 2.639 1.049 0.306 -2.718 8.256
xa1[T.4]:xo3 -16.0716 18.858 -0.852 0.404 -55.288 23.145
xa1[T.5]:xo3 6.4507 11.787 0.547 0.590 -18.062 30.963
xa1[T.6]:xo3 -0.7036 12.711 -0.055 0.956 -27.137 25.729
xa1[T.7]:xo3 3.4320 11.632 0.295 0.771 -20.759 27.623
xa1[T.8]:xo3 -5.5327 20.324 -0.272 0.788 -47.798 36.733
xa1[T.9]:xo3 -22.4245 12.814 -1.750 0.095 -49.073 4.224
xa1[T.10]:xo3 -60.5616 26.802 -2.260 0.035 -116.298 -4.825
xa1[T.11]:xo3 19.4597 10.523 1.849 0.079 -2.423 41.343
xa2[T.1]:xo3 1.2145 11.048 0.110 0.914 -21.761 24.190
xa2[T.2]:xo3 -6.7166 7.353 -0.913 0.371 -22.008 8.575
xa2[T.3]:xo3 2.7691 2.639 1.049 0.306 -2.718 8.256
xa2[T.4]:xo3 -16.0716 18.858 -0.852 0.404 -55.288 23.145
xa2[T.5]:xo3 6.4507 11.787 0.547 0.590 -18.062 30.963
xa2[T.6]:xo3 -0.7036 12.711 -0.055 0.956 -27.137 25.729
xa2[T.7]:xo3 3.4320 11.632 0.295 0.771 -20.759 27.623
xa2[T.8]:xo3 -5.5327 20.324 -0.272 0.788 -47.798 36.733
xa2[T.9]:xo3 -22.4245 12.814 -1.750 0.095 -49.073 4.224
xa2[T.10]:xo3 -60.5616 26.802 -2.260 0.035 -116.298 -4.825
xa2[T.11]:xo3 19.4597 10.523 1.849 0.079 -2.423 41.343
xa3[T.latin]:xo3 -16.0087 9.309 -1.720 0.100 -35.368 3.351
xa3[T.pop]:xo3 -15.8463 19.839 -0.799 0.433 -57.105 25.412
xa3[T.r&b]:xo3 -0.9604 5.958 -0.161 0.873 -13.351 11.430
xa3[T.rap]:xo3 -5.3559 8.279 -0.647 0.525 -22.573 11.861
xa3[T.rock]:xo3 31.8166 16.269 1.956 0.064 -2.016 65.650
xo4 120.0631 55.843 2.150 0.043 3.931 236.195
xa1[T.1]:xo4 -68.5515 34.639 -1.979 0.061 -140.588 3.485
xa1[T.2]:xo4 -76.2567 33.974 -2.245 0.036 -146.910 -5.603
xa1[T.3]:xo4 -40.8341 17.518 -2.331 0.030 -77.264 -4.404
xa1[T.4]:xo4 -71.6890 30.402 -2.358 0.028 -134.913 -8.465
xa1[T.5]:xo4 -74.5648 36.295 -2.054 0.053 -150.045 0.916
xa1[T.6]:xo4 -70.8929 33.068 -2.144 0.044 -139.662 -2.124
xa1[T.7]:xo4 -70.2491 30.969 -2.268 0.034 -134.653 -5.845
xa1[T.8]:xo4 -43.0861 42.908 -1.004 0.327 -132.318 46.146
xa1[T.9]:xo4 -125.5201 57.314 -2.190 0.040 -244.711 -6.329
xa1[T.10]:xo4 -84.7086 41.289 -2.052 0.053 -170.573 1.156
xa1[T.11]:xo4 -73.4482 32.326 -2.272 0.034 -140.673 -6.223
xa2[T.1]:xo4 -68.5515 34.639 -1.979 0.061 -140.588 3.485
xa2[T.2]:xo4 -76.2567 33.974 -2.245 0.036 -146.910 -5.603
xa2[T.3]:xo4 -40.8341 17.518 -2.331 0.030 -77.264 -4.404
xa2[T.4]:xo4 -71.6890 30.402 -2.358 0.028 -134.913 -8.465
xa2[T.5]:xo4 -74.5648 36.295 -2.054 0.053 -150.045 0.916
xa2[T.6]:xo4 -70.8929 33.068 -2.144 0.044 -139.662 -2.124
xa2[T.7]:xo4 -70.2491 30.969 -2.268 0.034 -134.653 -5.845
xa2[T.8]:xo4 -43.0861 42.908 -1.004 0.327 -132.318 46.146
xa2[T.9]:xo4 -125.5201 57.314 -2.190 0.040 -244.711 -6.329
xa2[T.10]:xo4 -84.7086 41.289 -2.052 0.053 -170.573 1.156
xa2[T.11]:xo4 -73.4482 32.326 -2.272 0.034 -140.673 -6.223
xa3[T.latin]:xo4 17.0803 13.697 1.247 0.226 -11.404 45.565
xa3[T.pop]:xo4 -26.2518 37.649 -0.697 0.493 -104.547 52.043
xa3[T.r&b]:xo4 15.6392 7.124 2.195 0.040 0.825 30.453
xa3[T.rap]:xo4 17.0852 12.989 1.315 0.203 -9.927 44.097
xa3[T.rock]:xo4 94.6516 44.165 2.143 0.044 2.806 186.497
xo5 -20.9390 18.234 -1.148 0.264 -58.858 16.980
xa1[T.1]:xo5 7.6559 10.847 0.706 0.488 -14.902 30.213
xa1[T.2]:xo5 -6.2421 15.712 -0.397 0.695 -38.916 26.432
xa1[T.3]:xo5 -22.0823 10.317 -2.140 0.044 -43.537 -0.628
xa1[T.4]:xo5 13.3947 10.649 1.258 0.222 -8.751 35.540
xa1[T.5]:xo5 41.1616 21.780 1.890 0.073 -4.133 86.456
xa1[T.6]:xo5 18.9505 10.591 1.789 0.088 -3.075 40.975
xa1[T.7]:xo5 9.0203 10.563 0.854 0.403 -12.946 30.986
xa1[T.8]:xo5 47.7946 46.468 1.029 0.315 -48.841 144.430
xa1[T.9]:xo5 37.4434 15.936 2.350 0.029 4.303 70.584
xa1[T.10]:xo5 33.5595 27.031 1.242 0.228 -22.654 89.773
xa1[T.11]:xo5 0.4434 13.682 0.032 0.974 -28.009 28.896
xa2[T.1]:xo5 7.6559 10.847 0.706 0.488 -14.902 30.213
xa2[T.2]:xo5 -6.2421 15.712 -0.397 0.695 -38.916 26.432
xa2[T.3]:xo5 -22.0823 10.317 -2.140 0.044 -43.537 -0.628
xa2[T.4]:xo5 13.3947 10.649 1.258 0.222 -8.751 35.540
xa2[T.5]:xo5 41.1616 21.780 1.890 0.073 -4.133 86.456
xa2[T.6]:xo5 18.9505 10.591 1.789 0.088 -3.075 40.975
xa2[T.7]:xo5 9.0203 10.563 0.854 0.403 -12.946 30.986
xa2[T.8]:xo5 47.7946 46.468 1.029 0.315 -48.841 144.430
xa2[T.9]:xo5 37.4434 15.936 2.350 0.029 4.303 70.584
xa2[T.10]:xo5 33.5595 27.031 1.242 0.228 -22.654 89.773
xa2[T.11]:xo5 0.4434 13.682 0.032 0.974 -28.009 28.896
xa3[T.latin]:xo5 -5.9637 6.992 -0.853 0.403 -20.505 8.578
xa3[T.pop]:xo5 -1.7948 10.368 -0.173 0.864 -23.356 19.767
xa3[T.r&b]:xo5 9.0266 10.626 0.849 0.405 -13.072 31.125
xa3[T.rap]:xo5 7.2661 6.133 1.185 0.249 -5.488 20.020
xa3[T.rock]:xo5 -24.6302 14.724 -1.673 0.109 -55.251 5.991
xo6 87.7997 51.255 1.713 0.101 -18.792 194.391
xa1[T.1]:xo6 80.8170 56.870 1.421 0.170 -37.452 199.086
xa1[T.2]:xo6 -97.0194 45.838 -2.117 0.046 -192.344 -1.694
xa1[T.3]:xo6 -5.3659 2.292 -2.342 0.029 -10.132 -0.600
xa1[T.4]:xo6 -114.0365 93.879 -1.215 0.238 -309.268 81.195
xa1[T.5]:xo6 19.4771 92.177 0.211 0.835 -172.215 211.169
xa1[T.6]:xo6 38.3523 32.662 1.174 0.253 -29.573 106.278
xa1[T.7]:xo6 23.7828 25.348 0.938 0.359 -28.932 76.498
xa1[T.8]:xo6 -41.5649 39.220 -1.060 0.301 -123.127 39.998
xa1[T.9]:xo6 -72.1245 54.385 -1.326 0.199 -185.224 40.975
xa1[T.10]:xo6 2.1230 6.392 0.332 0.743 -11.170 15.416
xa1[T.11]:xo6 -82.8673 84.988 -0.975 0.341 -259.609 93.874
xa2[T.1]:xo6 80.8170 56.870 1.421 0.170 -37.452 199.086
xa2[T.2]:xo6 -97.0194 45.838 -2.117 0.046 -192.344 -1.694
xa2[T.3]:xo6 -5.3659 2.292 -2.342 0.029 -10.132 -0.600
xa2[T.4]:xo6 -114.0365 93.879 -1.215 0.238 -309.268 81.195
xa2[T.5]:xo6 19.4771 92.177 0.211 0.835 -172.215 211.169
xa2[T.6]:xo6 38.3523 32.662 1.174 0.253 -29.573 106.278
xa2[T.7]:xo6 23.7828 25.348 0.938 0.359 -28.932 76.498
xa2[T.8]:xo6 -41.5649 39.220 -1.060 0.301 -123.127 39.998
xa2[T.9]:xo6 -72.1245 54.385 -1.326 0.199 -185.224 40.975
xa2[T.10]:xo6 2.1230 6.392 0.332 0.743 -11.170 15.416
xa2[T.11]:xo6 -82.8673 84.988 -0.975 0.341 -259.609 93.874
xa3[T.latin]:xo6 -95.6508 40.180 -2.381 0.027 -179.210 -12.091
xa3[T.pop]:xo6 3.9782 15.368 0.259 0.798 -27.980 35.937
xa3[T.r&b]:xo6 328.9448 137.874 2.386 0.027 42.221 615.669
xa3[T.rap]:xo6 -145.6591 92.622 -1.573 0.131 -338.278 46.959
xa3[T.rock]:xo6 25.2723 31.511 0.802 0.432 -40.258 90.802
xo7 -63.5582 26.716 -2.379 0.027 -119.116 -8.000
xa1[T.1]:xo7 35.5831 15.459 2.302 0.032 3.435 67.732
xa1[T.2]:xo7 45.4763 21.910 2.076 0.050 -0.089 91.041
xa1[T.3]:xo7 2.1597 1.822 1.185 0.249 -1.629 5.948
xa1[T.4]:xo7 33.7546 14.977 2.254 0.035 2.607 64.902
xa1[T.5]:xo7 38.5275 20.527 1.877 0.074 -4.161 81.216
xa1[T.6]:xo7 39.0314 18.117 2.154 0.043 1.355 76.708
xa1[T.7]:xo7 40.9796 15.523 2.640 0.015 8.699 73.261
xa1[T.8]:xo7 17.0195 34.311 0.496 0.625 -54.334 88.373
xa1[T.9]:xo7 45.1725 18.678 2.419 0.025 6.331 84.015
xa1[T.10]:xo7 32.9818 12.516 2.635 0.015 6.953 59.011
xa1[T.11]:xo7 42.7014 17.282 2.471 0.022 6.761 78.642
xa2[T.1]:xo7 35.5831 15.459 2.302 0.032 3.435 67.732
xa2[T.2]:xo7 45.4763 21.910 2.076 0.050 -0.089 91.041
xa2[T.3]:xo7 2.1597 1.822 1.185 0.249 -1.629 5.948
xa2[T.4]:xo7 33.7546 14.977 2.254 0.035 2.607 64.902
xa2[T.5]:xo7 38.5275 20.527 1.877 0.074 -4.161 81.216
xa2[T.6]:xo7 39.0314 18.117 2.154 0.043 1.355 76.708
xa2[T.7]:xo7 40.9796 15.523 2.640 0.015 8.699 73.261
xa2[T.8]:xo7 17.0195 34.311 0.496 0.625 -54.334 88.373
xa2[T.9]:xo7 45.1725 18.678 2.419 0.025 6.331 84.015
xa2[T.10]:xo7 32.9818 12.516 2.635 0.015 6.953 59.011
xa2[T.11]:xo7 42.7014 17.282 2.471 0.022 6.761 78.642
xa3[T.latin]:xo7 -10.2309 10.272 -0.996 0.331 -31.592 11.131
xa3[T.pop]:xo7 -20.0535 16.086 -1.247 0.226 -53.507 13.400
xa3[T.r&b]:xo7 -18.0141 9.696 -1.858 0.077 -38.179 2.151
xa3[T.rap]:xo7 -10.0704 6.670 -1.510 0.146 -23.941 3.801
xa3[T.rock]:xo7 -7.1680 8.590 -0.834 0.413 -25.032 10.696
xo8 -10.1263 8.918 -1.136 0.269 -28.672 8.419
xa1[T.1]:xo8 5.3607 6.689 0.801 0.432 -8.550 19.271
xa1[T.2]:xo8 -9.8431 11.689 -0.842 0.409 -34.152 14.465
xa1[T.3]:xo8 0.1062 1.753 0.061 0.952 -3.538 3.751
xa1[T.4]:xo8 4.1629 9.100 0.457 0.652 -14.762 23.088
xa1[T.5]:xo8 -21.9346 15.142 -1.449 0.162 -53.423 9.554
xa1[T.6]:xo8 -4.7922 5.144 -0.932 0.362 -15.489 5.905
xa1[T.7]:xo8 4.5072 5.752 0.784 0.442 -7.454 16.468
xa1[T.8]:xo8 0.2406 17.323 0.014 0.989 -35.784 36.265
xa1[T.9]:xo8 -16.2694 4.823 -3.373 0.003 -26.299 -6.240
xa1[T.10]:xo8 -25.2058 24.564 -1.026 0.317 -76.289 25.877
xa1[T.11]:xo8 -1.5201 6.197 -0.245 0.809 -14.408 11.368
xa2[T.1]:xo8 5.3607 6.689 0.801 0.432 -8.550 19.271
xa2[T.2]:xo8 -9.8431 11.689 -0.842 0.409 -34.152 14.465
xa2[T.3]:xo8 0.1062 1.753 0.061 0.952 -3.538 3.751
xa2[T.4]:xo8 4.1629 9.100 0.457 0.652 -14.762 23.088
xa2[T.5]:xo8 -21.9346 15.142 -1.449 0.162 -53.423 9.554
xa2[T.6]:xo8 -4.7922 5.144 -0.932 0.362 -15.489 5.905
xa2[T.7]:xo8 4.5072 5.752 0.784 0.442 -7.454 16.468
xa2[T.8]:xo8 0.2406 17.323 0.014 0.989 -35.784 36.265
xa2[T.9]:xo8 -16.2694 4.823 -3.373 0.003 -26.299 -6.240
xa2[T.10]:xo8 -25.2058 24.564 -1.026 0.317 -76.289 25.877
xa2[T.11]:xo8 -1.5201 6.197 -0.245 0.809 -14.408 11.368
xa3[T.latin]:xo8 -9.8433 13.076 -0.753 0.460 -37.037 17.350
xa3[T.pop]:xo8 -14.9232 12.502 -1.194 0.246 -40.922 11.075
xa3[T.r&b]:xo8 3.9901 9.514 0.419 0.679 -15.795 23.776
xa3[T.rap]:xo8 1.1183 6.066 0.184 0.856 -11.497 13.733
xa3[T.rock]:xo8 10.8462 13.321 0.814 0.425 -16.857 38.549
xo9 43.7944 16.930 2.587 0.017 8.586 79.002
xa1[T.1]:xo9 -29.8904 12.223 -2.445 0.023 -55.311 -4.470
xa1[T.2]:xo9 -39.6396 19.206 -2.064 0.052 -79.581 0.302
xa1[T.3]:xo9 11.6814 8.235 1.418 0.171 -5.445 28.807
xa1[T.4]:xo9 -34.7551 13.556 -2.564 0.018 -62.947 -6.563
xa1[T.5]:xo9 -13.9660 10.674 -1.308 0.205 -36.163 8.231
xa1[T.6]:xo9 -32.5316 12.862 -2.529 0.019 -59.279 -5.785
xa1[T.7]:xo9 -30.5116 12.352 -2.470 0.022 -56.199 -4.824
xa1[T.8]:xo9 -38.0098 20.160 -1.885 0.073 -79.935 3.915
xa1[T.9]:xo9 -39.1292 17.241 -2.270 0.034 -74.984 -3.274
xa1[T.10]:xo9 -30.5285 11.361 -2.687 0.014 -54.156 -6.901
xa1[T.11]:xo9 -39.2545 14.754 -2.661 0.015 -69.936 -8.573
xa2[T.1]:xo9 -29.8904 12.223 -2.445 0.023 -55.311 -4.470
xa2[T.2]:xo9 -39.6396 19.206 -2.064 0.052 -79.581 0.302
xa2[T.3]:xo9 11.6814 8.235 1.418 0.171 -5.445 28.807
xa2[T.4]:xo9 -34.7551 13.556 -2.564 0.018 -62.947 -6.563
xa2[T.5]:xo9 -13.9660 10.674 -1.308 0.205 -36.163 8.231
xa2[T.6]:xo9 -32.5316 12.862 -2.529 0.019 -59.279 -5.785
xa2[T.7]:xo9 -30.5116 12.352 -2.470 0.022 -56.199 -4.824
xa2[T.8]:xo9 -38.0098 20.160 -1.885 0.073 -79.935 3.915
xa2[T.9]:xo9 -39.1292 17.241 -2.270 0.034 -74.984 -3.274
xa2[T.10]:xo9 -30.5285 11.361 -2.687 0.014 -54.156 -6.901
xa2[T.11]:xo9 -39.2545 14.754 -2.661 0.015 -69.936 -8.573
xa3[T.latin]:xo9 1.1565 9.162 0.126 0.901 -17.896 20.209
xa3[T.pop]:xo9 -25.6628 13.955 -1.839 0.080 -54.683 3.358
xa3[T.r&b]:xo9 25.3968 14.122 1.798 0.087 -3.972 54.765
xa3[T.rap]:xo9 16.9212 8.747 1.934 0.067 -1.270 35.112
xa3[T.rock]:xo9 17.4859 9.867 1.772 0.091 -3.035 38.006
xo10 -4.6984 6.947 -0.676 0.506 -19.146 9.750
xa1[T.1]:xo10 -3.1762 4.994 -0.636 0.532 -13.561 7.209
xa1[T.2]:xo10 16.7863 10.624 1.580 0.129 -5.308 38.881
xa1[T.3]:xo10 14.2603 9.273 1.538 0.139 -5.024 33.544
xa1[T.4]:xo10 -2.7096 6.485 -0.418 0.680 -16.196 10.777
xa1[T.5]:xo10 -37.7488 25.579 -1.476 0.155 -90.942 15.445
xa1[T.6]:xo10 -1.6730 9.279 -0.180 0.859 -20.970 17.624
xa1[T.7]:xo10 -5.1968 5.337 -0.974 0.341 -16.295 5.902
xa1[T.8]:xo10 -20.3232 6.452 -3.150 0.005 -33.741 -6.905
xa1[T.9]:xo10 0.5120 13.914 0.037 0.971 -28.424 29.448
xa1[T.10]:xo10 -14.2881 12.239 -1.167 0.256 -39.741 11.164
xa1[T.11]:xo10 -3.1051 4.237 -0.733 0.472 -11.916 5.706
xa2[T.1]:xo10 -3.1762 4.994 -0.636 0.532 -13.561 7.209
xa2[T.2]:xo10 16.7863 10.624 1.580 0.129 -5.308 38.881
xa2[T.3]:xo10 14.2603 9.273 1.538 0.139 -5.024 33.544
xa2[T.4]:xo10 -2.7096 6.485 -0.418 0.680 -16.196 10.777
xa2[T.5]:xo10 -37.7488 25.579 -1.476 0.155 -90.942 15.445
xa2[T.6]:xo10 -1.6730 9.279 -0.180 0.859 -20.970 17.624
xa2[T.7]:xo10 -5.1968 5.337 -0.974 0.341 -16.295 5.902
xa2[T.8]:xo10 -20.3232 6.452 -3.150 0.005 -33.741 -6.905
xa2[T.9]:xo10 0.5120 13.914 0.037 0.971 -28.424 29.448
xa2[T.10]:xo10 -14.2881 12.239 -1.167 0.256 -39.741 11.164
xa2[T.11]:xo10 -3.1051 4.237 -0.733 0.472 -11.916 5.706
xa3[T.latin]:xo10 15.3685 9.428 1.630 0.118 -4.239 34.976
xa3[T.pop]:xo10 -32.5203 20.860 -1.559 0.134 -75.901 10.860
xa3[T.r&b]:xo10 15.1484 17.200 0.881 0.388 -20.620 50.917
xa3[T.rap]:xo10 21.7511 10.272 2.118 0.046 0.389 43.113
xa3[T.rock]:xo10 -15.0375 9.644 -1.559 0.134 -35.093 5.018
np.power(xo1, 2) 22.7869 17.270 1.319 0.201 -13.129 58.702
xa1[T.1]:np.power(xo1, 2) -9.1322 8.669 -1.053 0.304 -27.160 8.895
xa1[T.2]:np.power(xo1, 2) -25.1314 15.209 -1.652 0.113 -56.761 6.498
xa1[T.3]:np.power(xo1, 2) -2.6175 3.178 -0.824 0.419 -9.227 3.992
xa1[T.4]:np.power(xo1, 2) -19.5349 12.850 -1.520 0.143 -46.259 7.189
xa1[T.5]:np.power(xo1, 2) -22.2275 34.121 -0.651 0.522 -93.186 48.731
xa1[T.6]:np.power(xo1, 2) -10.0422 10.701 -0.938 0.359 -32.296 12.212
xa1[T.7]:np.power(xo1, 2) -6.3028 11.870 -0.531 0.601 -30.989 18.383
xa1[T.8]:np.power(xo1, 2) -17.0924 17.494 -0.977 0.340 -53.474 19.289
xa1[T.9]:np.power(xo1, 2) 3.4357 4.278 0.803 0.431 -5.460 12.331
xa1[T.10]:np.power(xo1, 2) -27.8774 16.808 -1.659 0.112 -62.832 7.077
xa1[T.11]:np.power(xo1, 2) -13.3935 10.411 -1.286 0.212 -35.044 8.257
xa2[T.1]:np.power(xo1, 2) -9.1322 8.669 -1.053 0.304 -27.160 8.895
xa2[T.2]:np.power(xo1, 2) -25.1314 15.209 -1.652 0.113 -56.761 6.498
xa2[T.3]:np.power(xo1, 2) -2.6175 3.178 -0.824 0.419 -9.227 3.992
xa2[T.4]:np.power(xo1, 2) -19.5349 12.850 -1.520 0.143 -46.259 7.189
xa2[T.5]:np.power(xo1, 2) -22.2275 34.121 -0.651 0.522 -93.186 48.731
xa2[T.6]:np.power(xo1, 2) -10.0422 10.701 -0.938 0.359 -32.296 12.212
xa2[T.7]:np.power(xo1, 2) -6.3028 11.870 -0.531 0.601 -30.989 18.383
xa2[T.8]:np.power(xo1, 2) -17.0924 17.494 -0.977 0.340 -53.474 19.289
xa2[T.9]:np.power(xo1, 2) 3.4357 4.278 0.803 0.431 -5.460 12.331
xa2[T.10]:np.power(xo1, 2) -27.8774 16.808 -1.659 0.112 -62.832 7.077
xa2[T.11]:np.power(xo1, 2) -13.3935 10.411 -1.286 0.212 -35.044 8.257
xa3[T.latin]:np.power(xo1, 2) 18.4317 14.226 1.296 0.209 -11.152 48.016
xa3[T.pop]:np.power(xo1, 2) 7.6675 16.418 0.467 0.645 -26.476 41.811
xa3[T.r&b]:np.power(xo1, 2) -3.5473 8.254 -0.430 0.672 -20.713 13.618
xa3[T.rap]:np.power(xo1, 2) 5.9358 8.493 0.699 0.492 -11.727 23.599
xa3[T.rock]:np.power(xo1, 2) -11.2539 10.379 -1.084 0.291 -32.839 10.331
np.power(xo2, 2) 51.1516 24.724 2.069 0.051 -0.265 102.568
xa1[T.1]:np.power(xo2, 2) -28.6568 13.587 -2.109 0.047 -56.913 -0.400
xa1[T.2]:np.power(xo2, 2) -6.9762 9.053 -0.771 0.450 -25.803 11.850
xa1[T.3]:np.power(xo2, 2) 2.6927 6.475 0.416 0.682 -10.773 16.158
xa1[T.4]:np.power(xo2, 2) -31.7028 16.120 -1.967 0.063 -65.226 1.820
xa1[T.5]:np.power(xo2, 2) -15.7052 9.741 -1.612 0.122 -35.963 4.552
xa1[T.6]:np.power(xo2, 2) -19.4313 11.088 -1.753 0.094 -42.489 3.627
xa1[T.7]:np.power(xo2, 2) -22.6355 12.232 -1.851 0.078 -48.073 2.802
xa1[T.8]:np.power(xo2, 2) -41.2740 21.284 -1.939 0.066 -85.537 2.989
xa1[T.9]:np.power(xo2, 2) -31.5486 16.930 -1.863 0.076 -66.757 3.660
xa1[T.10]:np.power(xo2, 2) -56.7698 34.472 -1.647 0.114 -128.457 14.918
xa1[T.11]:np.power(xo2, 2) -17.4973 11.508 -1.520 0.143 -41.429 6.434
xa2[T.1]:np.power(xo2, 2) -28.6568 13.587 -2.109 0.047 -56.913 -0.400
xa2[T.2]:np.power(xo2, 2) -6.9762 9.053 -0.771 0.450 -25.803 11.850
xa2[T.3]:np.power(xo2, 2) 2.6927 6.475 0.416 0.682 -10.773 16.158
xa2[T.4]:np.power(xo2, 2) -31.7028 16.120 -1.967 0.063 -65.226 1.820
xa2[T.5]:np.power(xo2, 2) -15.7052 9.741 -1.612 0.122 -35.963 4.552
xa2[T.6]:np.power(xo2, 2) -19.4313 11.088 -1.753 0.094 -42.489 3.627
xa2[T.7]:np.power(xo2, 2) -22.6355 12.232 -1.851 0.078 -48.073 2.802
xa2[T.8]:np.power(xo2, 2) -41.2740 21.284 -1.939 0.066 -85.537 2.989
xa2[T.9]:np.power(xo2, 2) -31.5486 16.930 -1.863 0.076 -66.757 3.660
xa2[T.10]:np.power(xo2, 2) -56.7698 34.472 -1.647 0.114 -128.457 14.918
xa2[T.11]:np.power(xo2, 2) -17.4973 11.508 -1.520 0.143 -41.429 6.434
xa3[T.latin]:np.power(xo2, 2) -30.5226 14.052 -2.172 0.041 -59.745 -1.300
xa3[T.pop]:np.power(xo2, 2) 87.2739 35.174 2.481 0.022 14.126 160.422
xa3[T.r&b]:np.power(xo2, 2) -7.5034 7.317 -1.025 0.317 -22.721 7.714
xa3[T.rap]:np.power(xo2, 2) -6.1757 7.521 -0.821 0.421 -21.817 9.466
xa3[T.rock]:np.power(xo2, 2) -4.2898 8.980 -0.478 0.638 -22.965 14.385
np.power(xo3, 2) -8.0702 9.796 -0.824 0.419 -28.441 12.301
xa1[T.1]:np.power(xo3, 2) -2.6518 5.194 -0.511 0.615 -13.453 8.150
xa1[T.2]:np.power(xo3, 2) -3.6277 4.816 -0.753 0.460 -13.642 6.387
xa1[T.3]:np.power(xo3, 2) -10.0411 5.683 -1.767 0.092 -21.860 1.777
xa1[T.4]:np.power(xo3, 2) -5.0832 5.950 -0.854 0.403 -17.456 7.290
xa1[T.5]:np.power(xo3, 2) -30.3908 24.691 -1.231 0.232 -81.739 20.957
xa1[T.6]:np.power(xo3, 2) 3.5378 5.930 0.597 0.557 -8.795 15.871
xa1[T.7]:np.power(xo3, 2) 0.0455 3.157 0.014 0.989 -6.520 6.611
xa1[T.8]:np.power(xo3, 2) 29.3789 10.338 2.842 0.010 7.880 50.878
xa1[T.9]:np.power(xo3, 2) -4.2921 3.496 -1.228 0.233 -11.562 2.978
xa1[T.10]:np.power(xo3, 2) 36.7507 16.632 2.210 0.038 2.162 71.339
xa1[T.11]:np.power(xo3, 2) -13.0223 5.541 -2.350 0.029 -24.546 -1.499
xa2[T.1]:np.power(xo3, 2) -2.6518 5.194 -0.511 0.615 -13.453 8.150
xa2[T.2]:np.power(xo3, 2) -3.6277 4.816 -0.753 0.460 -13.642 6.387
xa2[T.3]:np.power(xo3, 2) -10.0411 5.683 -1.767 0.092 -21.860 1.777
xa2[T.4]:np.power(xo3, 2) -5.0832 5.950 -0.854 0.403 -17.456 7.290
xa2[T.5]:np.power(xo3, 2) -30.3908 24.691 -1.231 0.232 -81.739 20.957
xa2[T.6]:np.power(xo3, 2) 3.5378 5.930 0.597 0.557 -8.795 15.871
xa2[T.7]:np.power(xo3, 2) 0.0455 3.157 0.014 0.989 -6.520 6.611
xa2[T.8]:np.power(xo3, 2) 29.3789 10.338 2.842 0.010 7.880 50.878
xa2[T.9]:np.power(xo3, 2) -4.2921 3.496 -1.228 0.233 -11.562 2.978
xa2[T.10]:np.power(xo3, 2) 36.7507 16.632 2.210 0.038 2.162 71.339
xa2[T.11]:np.power(xo3, 2) -13.0223 5.541 -2.350 0.029 -24.546 -1.499
xa3[T.latin]:np.power(xo3, 2) 17.2510 10.146 1.700 0.104 -3.848 38.350
xa3[T.pop]:np.power(xo3, 2) -46.7473 36.354 -1.286 0.212 -122.349 28.855
xa3[T.r&b]:np.power(xo3, 2) 16.0195 10.689 1.499 0.149 -6.209 38.248
xa3[T.rap]:np.power(xo3, 2) 17.9317 9.655 1.857 0.077 -2.148 38.011
xa3[T.rock]:np.power(xo3, 2) -12.5841 5.915 -2.127 0.045 -24.886 -0.282
np.power(xo4, 2) 23.0957 18.501 1.248 0.226 -15.379 61.570
xa1[T.1]:np.power(xo4, 2) 4.3311 3.475 1.246 0.226 -2.896 11.559
xa1[T.2]:np.power(xo4, 2) -30.8944 17.910 -1.725 0.099 -68.141 6.352
xa1[T.3]:np.power(xo4, 2) 5.0556 4.600 1.099 0.284 -4.510 14.622
xa1[T.4]:np.power(xo4, 2) 2.5441 6.614 0.385 0.704 -11.209 16.298
xa1[T.5]:np.power(xo4, 2) -23.6170 24.716 -0.956 0.350 -75.017 27.783
xa1[T.6]:np.power(xo4, 2) 3.4758 2.603 1.335 0.196 -1.938 8.890
xa1[T.7]:np.power(xo4, 2) 4.5259 3.038 1.490 0.151 -1.791 10.843
xa1[T.8]:np.power(xo4, 2) -7.3912 14.365 -0.515 0.612 -37.265 22.482
xa1[T.9]:np.power(xo4, 2) 23.8945 8.757 2.729 0.013 5.684 42.105
xa1[T.10]:np.power(xo4, 2) 4.0123 4.810 0.834 0.414 -5.990 14.015
xa1[T.11]:np.power(xo4, 2) 4.6767 2.925 1.599 0.125 -1.406 10.759
xa2[T.1]:np.power(xo4, 2) 4.3311 3.475 1.246 0.226 -2.896 11.559
xa2[T.2]:np.power(xo4, 2) -30.8944 17.910 -1.725 0.099 -68.141 6.352
xa2[T.3]:np.power(xo4, 2) 5.0556 4.600 1.099 0.284 -4.510 14.622
xa2[T.4]:np.power(xo4, 2) 2.5441 6.614 0.385 0.704 -11.209 16.298
xa2[T.5]:np.power(xo4, 2) -23.6170 24.716 -0.956 0.350 -75.017 27.783
xa2[T.6]:np.power(xo4, 2) 3.4758 2.603 1.335 0.196 -1.938 8.890
xa2[T.7]:np.power(xo4, 2) 4.5259 3.038 1.490 0.151 -1.791 10.843
xa2[T.8]:np.power(xo4, 2) -7.3912 14.365 -0.515 0.612 -37.265 22.482
xa2[T.9]:np.power(xo4, 2) 23.8945 8.757 2.729 0.013 5.684 42.105
xa2[T.10]:np.power(xo4, 2) 4.0123 4.810 0.834 0.414 -5.990 14.015
xa2[T.11]:np.power(xo4, 2) 4.6767 2.925 1.599 0.125 -1.406 10.759
xa3[T.latin]:np.power(xo4, 2) -15.9690 11.603 -1.376 0.183 -40.098 8.160
xa3[T.pop]:np.power(xo4, 2) 20.4396 33.021 0.619 0.543 -48.232 89.111
xa3[T.r&b]:np.power(xo4, 2) -28.4160 16.373 -1.736 0.097 -62.465 5.633
xa3[T.rap]:np.power(xo4, 2) -30.1318 20.100 -1.499 0.149 -71.933 11.669
xa3[T.rock]:np.power(xo4, 2) 78.1777 44.224 1.768 0.092 -13.791 170.147
np.power(xo5, 2) 121.8438 58.682 2.076 0.050 -0.192 243.880
xa1[T.1]:np.power(xo5, 2) -70.3206 31.223 -2.252 0.035 -135.253 -5.388
xa1[T.2]:np.power(xo5, 2) -51.8574 19.428 -2.669 0.014 -92.261 -11.454
xa1[T.3]:np.power(xo5, 2) -61.8595 28.544 -2.167 0.042 -121.220 -2.499
xa1[T.4]:np.power(xo5, 2) -54.0857 28.235 -1.916 0.069 -112.803 4.632
xa1[T.5]:np.power(xo5, 2) -78.5360 34.882 -2.251 0.035 -151.078 -5.994
xa1[T.6]:np.power(xo5, 2) -68.3001 30.275 -2.256 0.035 -131.261 -5.339
xa1[T.7]:np.power(xo5, 2) -72.8173 32.010 -2.275 0.034 -139.386 -6.248
xa1[T.8]:np.power(xo5, 2) -76.5608 32.547 -2.352 0.028 -144.245 -8.877
xa1[T.9]:np.power(xo5, 2) -101.9366 43.415 -2.348 0.029 -192.222 -11.651
xa1[T.10]:np.power(xo5, 2) -77.4584 34.483 -2.246 0.036 -149.170 -5.747
xa1[T.11]:np.power(xo5, 2) -50.5576 28.571 -1.770 0.091 -109.974 8.859
xa2[T.1]:np.power(xo5, 2) -70.3206 31.223 -2.252 0.035 -135.253 -5.388
xa2[T.2]:np.power(xo5, 2) -51.8574 19.428 -2.669 0.014 -92.261 -11.454
xa2[T.3]:np.power(xo5, 2) -61.8595 28.544 -2.167 0.042 -121.220 -2.499
xa2[T.4]:np.power(xo5, 2) -54.0857 28.235 -1.916 0.069 -112.803 4.632
xa2[T.5]:np.power(xo5, 2) -78.5360 34.882 -2.251 0.035 -151.078 -5.994
xa2[T.6]:np.power(xo5, 2) -68.3001 30.275 -2.256 0.035 -131.261 -5.339
xa2[T.7]:np.power(xo5, 2) -72.8173 32.010 -2.275 0.034 -139.386 -6.248
xa2[T.8]:np.power(xo5, 2) -76.5608 32.547 -2.352 0.028 -144.245 -8.877
xa2[T.9]:np.power(xo5, 2) -101.9366 43.415 -2.348 0.029 -192.222 -11.651
xa2[T.10]:np.power(xo5, 2) -77.4584 34.483 -2.246 0.036 -149.170 -5.747
xa2[T.11]:np.power(xo5, 2) -50.5576 28.571 -1.770 0.091 -109.974 8.859
xa3[T.latin]:np.power(xo5, 2) 2.2426 11.211 0.200 0.843 -21.071 25.556
xa3[T.pop]:np.power(xo5, 2) 55.8339 67.796 0.824 0.419 -85.156 196.824
xa3[T.r&b]:np.power(xo5, 2) 18.1788 14.636 1.242 0.228 -12.259 48.616
xa3[T.rap]:np.power(xo5, 2) 14.0792 11.005 1.279 0.215 -8.807 36.965
xa3[T.rock]:np.power(xo5, 2) 29.4353 16.535 1.780 0.090 -4.951 63.821
np.power(xo6, 2) -12.0823 9.083 -1.330 0.198 -30.972 6.807
xa1[T.1]:np.power(xo6, 2) 2.3554 17.174 0.137 0.892 -33.359 38.070
xa1[T.2]:np.power(xo6, 2) 16.9177 7.739 2.186 0.040 0.825 33.011
xa1[T.3]:np.power(xo6, 2) 1.1312 0.482 2.344 0.029 0.128 2.135
xa1[T.4]:np.power(xo6, 2) -66.2693 46.414 -1.428 0.168 -162.793 30.255
xa1[T.5]:np.power(xo6, 2) -10.1098 85.733 -0.118 0.907 -188.401 168.181
xa1[T.6]:np.power(xo6, 2) -38.9886 50.521 -0.772 0.449 -144.053 66.076
xa1[T.7]:np.power(xo6, 2) -195.2582 101.036 -1.933 0.067 -405.375 14.858
xa1[T.8]:np.power(xo6, 2) 29.8715 14.985 1.993 0.059 -1.293 61.036
xa1[T.9]:np.power(xo6, 2) -83.9641 34.686 -2.421 0.025 -156.098 -11.831
xa1[T.10]:np.power(xo6, 2) 17.0652 7.498 2.276 0.033 1.473 32.658
xa1[T.11]:np.power(xo6, 2) 179.8595 87.896 2.046 0.053 -2.931 362.650
xa2[T.1]:np.power(xo6, 2) 2.3554 17.174 0.137 0.892 -33.359 38.070
xa2[T.2]:np.power(xo6, 2) 16.9177 7.739 2.186 0.040 0.825 33.011
xa2[T.3]:np.power(xo6, 2) 1.1312 0.482 2.344 0.029 0.128 2.135
xa2[T.4]:np.power(xo6, 2) -66.2693 46.414 -1.428 0.168 -162.793 30.255
xa2[T.5]:np.power(xo6, 2) -10.1098 85.733 -0.118 0.907 -188.401 168.181
xa2[T.6]:np.power(xo6, 2) -38.9886 50.521 -0.772 0.449 -144.053 66.076
xa2[T.7]:np.power(xo6, 2) -195.2582 101.036 -1.933 0.067 -405.375 14.858
xa2[T.8]:np.power(xo6, 2) 29.8715 14.985 1.993 0.059 -1.293 61.036
xa2[T.9]:np.power(xo6, 2) -83.9641 34.686 -2.421 0.025 -156.098 -11.831
xa2[T.10]:np.power(xo6, 2) 17.0652 7.498 2.276 0.033 1.473 32.658
xa2[T.11]:np.power(xo6, 2) 179.8595 87.896 2.046 0.053 -2.931 362.650
xa3[T.latin]:np.power(xo6, 2) 197.8834 73.060 2.709 0.013 45.948 349.819
xa3[T.pop]:np.power(xo6, 2) 3.8618 10.463 0.369 0.716 -17.897 25.620
xa3[T.r&b]:np.power(xo6, 2) -46.5173 50.121 -0.928 0.364 -150.751 57.716
xa3[T.rap]:np.power(xo6, 2) 39.1669 25.732 1.522 0.143 -14.347 92.680
xa3[T.rock]:np.power(xo6, 2) -44.5284 20.632 -2.158 0.043 -87.435 -1.622
np.power(xo7, 2) -6.1245 7.395 -0.828 0.417 -21.503 9.254
xa1[T.1]:np.power(xo7, 2) 9.3967 6.188 1.518 0.144 -3.473 22.266
xa1[T.2]:np.power(xo7, 2) -2.7903 5.177 -0.539 0.596 -13.557 7.976
xa1[T.3]:np.power(xo7, 2) 8.9033 4.512 1.973 0.062 -0.480 18.287
xa1[T.4]:np.power(xo7, 2) -4.9255 9.241 -0.533 0.600 -24.143 14.292
xa1[T.5]:np.power(xo7, 2) 12.7906 7.795 1.641 0.116 -3.420 29.001
xa1[T.6]:np.power(xo7, 2) 12.5521 5.956 2.108 0.047 0.167 24.937
xa1[T.7]:np.power(xo7, 2) 7.9564 6.717 1.184 0.249 -6.013 21.926
xa1[T.8]:np.power(xo7, 2) 21.9126 13.446 1.630 0.118 -6.050 49.875
xa1[T.9]:np.power(xo7, 2) -3.2115 3.845 -0.835 0.413 -11.208 4.785
xa1[T.10]:np.power(xo7, 2) -8.0086 9.554 -0.838 0.411 -27.877 11.860
xa1[T.11]:np.power(xo7, 2) 15.4619 9.217 1.678 0.108 -3.706 34.629
xa2[T.1]:np.power(xo7, 2) 9.3967 6.188 1.518 0.144 -3.473 22.266
xa2[T.2]:np.power(xo7, 2) -2.7903 5.177 -0.539 0.596 -13.557 7.976
xa2[T.3]:np.power(xo7, 2) 8.9033 4.512 1.973 0.062 -0.480 18.287
xa2[T.4]:np.power(xo7, 2) -4.9255 9.241 -0.533 0.600 -24.143 14.292
xa2[T.5]:np.power(xo7, 2) 12.7906 7.795 1.641 0.116 -3.420 29.001
xa2[T.6]:np.power(xo7, 2) 12.5521 5.956 2.108 0.047 0.167 24.937
xa2[T.7]:np.power(xo7, 2) 7.9564 6.717 1.184 0.249 -6.013 21.926
xa2[T.8]:np.power(xo7, 2) 21.9126 13.446 1.630 0.118 -6.050 49.875
xa2[T.9]:np.power(xo7, 2) -3.2115 3.845 -0.835 0.413 -11.208 4.785
xa2[T.10]:np.power(xo7, 2) -8.0086 9.554 -0.838 0.411 -27.877 11.860
xa2[T.11]:np.power(xo7, 2) 15.4619 9.217 1.678 0.108 -3.706 34.629
xa3[T.latin]:np.power(xo7, 2) -19.9520 11.981 -1.665 0.111 -44.867 4.964
xa3[T.pop]:np.power(xo7, 2) 9.3972 21.830 0.430 0.671 -36.001 54.796
xa3[T.r&b]:np.power(xo7, 2) -10.1728 6.294 -1.616 0.121 -23.263 2.917
xa3[T.rap]:np.power(xo7, 2) -12.2779 8.025 -1.530 0.141 -28.966 4.411
xa3[T.rock]:np.power(xo7, 2) 17.3608 8.033 2.161 0.042 0.654 34.067
np.power(xo8, 2) -5.1836 12.631 -0.410 0.686 -31.451 21.084
xa1[T.1]:np.power(xo8, 2) 5.1319 6.104 0.841 0.410 -7.561 17.825
xa1[T.2]:np.power(xo8, 2) 6.6648 7.628 0.874 0.392 -9.199 22.528
xa1[T.3]:np.power(xo8, 2) -16.3328 5.944 -2.748 0.012 -28.693 -3.973
xa1[T.4]:np.power(xo8, 2) 22.1653 11.024 2.011 0.057 -0.760 45.091
xa1[T.5]:np.power(xo8, 2) 8.6172 6.287 1.371 0.185 -4.458 21.692
xa1[T.6]:np.power(xo8, 2) 10.4962 10.557 0.994 0.331 -11.459 32.451
xa1[T.7]:np.power(xo8, 2) 4.1816 6.014 0.695 0.495 -8.326 16.689
xa1[T.8]:np.power(xo8, 2) 22.6705 16.480 1.376 0.183 -11.600 56.942
xa1[T.9]:np.power(xo8, 2) -7.1939 4.172 -1.724 0.099 -15.871 1.483
xa1[T.10]:np.power(xo8, 2) 1.5853 8.209 0.193 0.849 -15.486 18.656
xa1[T.11]:np.power(xo8, 2) -5.9582 7.257 -0.821 0.421 -21.050 9.134
xa2[T.1]:np.power(xo8, 2) 5.1319 6.104 0.841 0.410 -7.561 17.825
xa2[T.2]:np.power(xo8, 2) 6.6648 7.628 0.874 0.392 -9.199 22.528
xa2[T.3]:np.power(xo8, 2) -16.3328 5.944 -2.748 0.012 -28.693 -3.973
xa2[T.4]:np.power(xo8, 2) 22.1653 11.024 2.011 0.057 -0.760 45.091
xa2[T.5]:np.power(xo8, 2) 8.6172 6.287 1.371 0.185 -4.458 21.692
xa2[T.6]:np.power(xo8, 2) 10.4962 10.557 0.994 0.331 -11.459 32.451
xa2[T.7]:np.power(xo8, 2) 4.1816 6.014 0.695 0.495 -8.326 16.689
xa2[T.8]:np.power(xo8, 2) 22.6705 16.480 1.376 0.183 -11.600 56.942
xa2[T.9]:np.power(xo8, 2) -7.1939 4.172 -1.724 0.099 -15.871 1.483
xa2[T.10]:np.power(xo8, 2) 1.5853 8.209 0.193 0.849 -15.486 18.656
xa2[T.11]:np.power(xo8, 2) -5.9582 7.257 -0.821 0.421 -21.050 9.134
xa3[T.latin]:np.power(xo8, 2) -2.8618 13.646 -0.210 0.836 -31.240 25.517
xa3[T.pop]:np.power(xo8, 2) 25.7241 10.993 2.340 0.029 2.863 48.585
xa3[T.r&b]:np.power(xo8, 2) -13.2994 4.745 -2.803 0.011 -23.167 -3.432
xa3[T.rap]:np.power(xo8, 2) -10.7373 8.189 -1.311 0.204 -27.766 6.292
xa3[T.rock]:np.power(xo8, 2) -1.6672 5.832 -0.286 0.778 -13.795 10.460
np.power(xo9, 2) -55.4894 36.597 -1.516 0.144 -131.597 20.618
xa1[T.1]:np.power(xo9, 2) 32.7894 21.305 1.539 0.139 -11.516 77.095
xa1[T.2]:np.power(xo9, 2) 37.8653 24.670 1.535 0.140 -13.439 89.169
xa1[T.3]:np.power(xo9, 2) 3.0690 8.834 0.347 0.732 -15.301 21.439
xa1[T.4]:np.power(xo9, 2) 32.0929 23.453 1.368 0.186 -16.680 80.866
xa1[T.5]:np.power(xo9, 2) 35.2401 23.911 1.474 0.155 -14.486 84.966
xa1[T.6]:np.power(xo9, 2) 38.3499 26.724 1.435 0.166 -17.225 93.925
xa1[T.7]:np.power(xo9, 2) 34.2193 19.494 1.755 0.094 -6.322 74.760
xa1[T.8]:np.power(xo9, 2) 20.6674 27.010 0.765 0.453 -35.504 76.838
xa1[T.9]:np.power(xo9, 2) 41.9548 28.246 1.485 0.152 -16.787 100.696
xa1[T.10]:np.power(xo9, 2) 32.2499 25.069 1.286 0.212 -19.884 84.384
xa1[T.11]:np.power(xo9, 2) 37.5498 23.586 1.592 0.126 -11.499 86.599
xa2[T.1]:np.power(xo9, 2) 32.7894 21.305 1.539 0.139 -11.516 77.095
xa2[T.2]:np.power(xo9, 2) 37.8653 24.670 1.535 0.140 -13.439 89.169
xa2[T.3]:np.power(xo9, 2) 3.0690 8.834 0.347 0.732 -15.301 21.439
xa2[T.4]:np.power(xo9, 2) 32.0929 23.453 1.368 0.186 -16.680 80.866
xa2[T.5]:np.power(xo9, 2) 35.2401 23.911 1.474 0.155 -14.486 84.966
xa2[T.6]:np.power(xo9, 2) 38.3499 26.724 1.435 0.166 -17.225 93.925
xa2[T.7]:np.power(xo9, 2) 34.2193 19.494 1.755 0.094 -6.322 74.760
xa2[T.8]:np.power(xo9, 2) 20.6674 27.010 0.765 0.453 -35.504 76.838
xa2[T.9]:np.power(xo9, 2) 41.9548 28.246 1.485 0.152 -16.787 100.696
xa2[T.10]:np.power(xo9, 2) 32.2499 25.069 1.286 0.212 -19.884 84.384
xa2[T.11]:np.power(xo9, 2) 37.5498 23.586 1.592 0.126 -11.499 86.599
xa3[T.latin]:np.power(xo9, 2) -23.5264 13.765 -1.709 0.102 -52.153 5.100
xa3[T.pop]:np.power(xo9, 2) 18.6704 17.853 1.046 0.308 -18.456 55.797
xa3[T.r&b]:np.power(xo9, 2) -18.5976 14.180 -1.312 0.204 -48.086 10.891
xa3[T.rap]:np.power(xo9, 2) -10.9162 6.678 -1.635 0.117 -24.805 2.972
xa3[T.rock]:np.power(xo9, 2) -27.1090 14.889 -1.821 0.083 -58.071 3.853
np.power(xo10, 2) -3.0368 9.594 -0.317 0.755 -22.990 16.916
xa1[T.1]:np.power(xo10, 2) -0.6272 5.954 -0.105 0.917 -13.009 11.755
xa1[T.2]:np.power(xo10, 2) -29.1337 14.499 -2.009 0.058 -59.287 1.020
xa1[T.3]:np.power(xo10, 2) 5.8041 4.606 1.260 0.221 -3.774 15.382
xa1[T.4]:np.power(xo10, 2) 7.8301 11.715 0.668 0.511 -16.533 32.193
xa1[T.5]:np.power(xo10, 2) 10.4271 11.112 0.938 0.359 -12.681 33.535
xa1[T.6]:np.power(xo10, 2) -3.7065 5.604 -0.661 0.516 -15.360 7.947
xa1[T.7]:np.power(xo10, 2) -1.7966 6.050 -0.297 0.769 -14.378 10.784
xa1[T.8]:np.power(xo10, 2) 3.8130 7.896 0.483 0.634 -12.609 20.235
xa1[T.9]:np.power(xo10, 2) 35.7296 21.718 1.645 0.115 -9.436 80.896
xa1[T.10]:np.power(xo10, 2) -14.5261 9.025 -1.609 0.122 -33.296 4.243
xa1[T.11]:np.power(xo10, 2) -1.1926 6.280 -0.190 0.851 -14.253 11.868
xa2[T.1]:np.power(xo10, 2) -0.6272 5.954 -0.105 0.917 -13.009 11.755
xa2[T.2]:np.power(xo10, 2) -29.1337 14.499 -2.009 0.058 -59.287 1.020
xa2[T.3]:np.power(xo10, 2) 5.8041 4.606 1.260 0.221 -3.774 15.382
xa2[T.4]:np.power(xo10, 2) 7.8301 11.715 0.668 0.511 -16.533 32.193
xa2[T.5]:np.power(xo10, 2) 10.4271 11.112 0.938 0.359 -12.681 33.535
xa2[T.6]:np.power(xo10, 2) -3.7065 5.604 -0.661 0.516 -15.360 7.947
xa2[T.7]:np.power(xo10, 2) -1.7966 6.050 -0.297 0.769 -14.378 10.784
xa2[T.8]:np.power(xo10, 2) 3.8130 7.896 0.483 0.634 -12.609 20.235
xa2[T.9]:np.power(xo10, 2) 35.7296 21.718 1.645 0.115 -9.436 80.896
xa2[T.10]:np.power(xo10, 2) -14.5261 9.025 -1.609 0.122 -33.296 4.243
xa2[T.11]:np.power(xo10, 2) -1.1926 6.280 -0.190 0.851 -14.253 11.868
xa3[T.latin]:np.power(xo10, 2) 1.9248 7.651 0.252 0.804 -13.986 17.836
xa3[T.pop]:np.power(xo10, 2) -20.8520 13.675 -1.525 0.142 -49.292 7.588
xa3[T.r&b]:np.power(xo10, 2) 7.0960 9.651 0.735 0.470 -12.974 27.166
xa3[T.rap]:np.power(xo10, 2) 3.7797 8.320 0.454 0.654 -13.522 21.081
xa3[T.rock]:np.power(xo10, 2) 18.5038 6.412 2.886 0.009 5.169 31.838
==============================================================================
Omnibus: 351.318 Durbin-Watson: 1.718
Prob(Omnibus): 0.000 Jarque-Bera (JB): 49605.422
Skew: -4.077 Prob(JB): 0.00
Kurtosis: 62.692 Cond. No. 1.46e+16
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The input rank is higher than the number of observations.
[3] The smallest eigenvalue is 1.51e-28. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
len(fit_07.params)
588
fit_07.pvalues < 0.05
Intercept False
xa1[T.1] False
xa1[T.2] False
xa1[T.3] True
xa1[T.4] False
...
xa3[T.latin]:np.power(xo10, 2) False
xa3[T.pop]:np.power(xo10, 2) False
xa3[T.r&b]:np.power(xo10, 2) False
xa3[T.rap]:np.power(xo10, 2) False
xa3[T.rock]:np.power(xo10, 2) True
Length: 588, dtype: bool
fit_07:
i: 588.
ii: 126.
iii:
xa1[T.3] 25.4538 (positive) xa1[T.10] 70.0216 (positive) xa2[T.3] 25.4538 (positive) xa2[T.10] 70.0216 (positive) xa1[T.3]:xo1 -11.8855 (negative) xa1[T.8]:xo1 22.094 (positive) xa2[T.3]:xo1 -11.8855 (negative) xa2[T.8]:xo1 22.094 (positive) xa3[T.rock]:xo1 -34.6541 (negative) xa1[T.11]:xo2 -36.4555 (negative) xa2[T.11]:xo2 -36.4555 (negative) xa3[T.latin]:xo2 50.8644 (positive) xa3[T.rap]:xo2 21.6763 (positive) xa1[T.10]:xo3 -60.5616 (negative) xa2[T.10]:xo3 -60.5616 (negative) xo4 120.0631 (positive) xa1[T.2]:xo4 -76.2567 (negative) xa1[T.3]:xo4 -40.8341 (negative) xa1[T.4]:xo4 -71.689 (negative) xa1[T.6]:xo4 -70.8929 (negative) xa1[T.7]:xo4 -70.2491 (negative) xa1[T.9]:xo4 -125.5201 (negative) xa1[T.11]:xo4 -73.4482 (negative) xa2[T.2]:xo4 -76.2567 (negative) xa2[T.3]:xo4 -40.8341 (negative) xa2[T.4]:xo4 -71.689 (negative) xa2[T.6]:xo4 -70.8929 (negative) xa2[T.7]:xo4 -70.2491 (negative) xa2[T.9]:xo4 -125.5201 (negative) xa2[T.11]:xo4 -73.4482 (negative) xa3[T.r&b]:xo4 15.6392 (positive) xa3[T.rock]:xo4 94.6516 (positive) xa1[T.3]:xo5 -22.0823 (negative) xa1[T.9]:xo5 37.4434 (positive) xa2[T.3]:xo5 -22.0823 (negative) xa2[T.9]:xo5 37.4434 (positive) xa1[T.2]:xo6 -97.0194 (negative) xa1[T.3]:xo6 -5.3659 (negative) xa2[T.2]:xo6 -97.0194 (negative) xa2[T.3]:xo6 -5.3659 (negative) xa3[T.latin]:xo6 -95.6508 (negative) xa3[T.r&b]:xo6 328.9448 (positive) xo7 -63.5582 (negative) xa1[T.1]:xo7 35.5831 (positive) xa1[T.4]:xo7 33.7546 (positive) xa1[T.6]:xo7 39.0314 (positive) xa1[T.7]:xo7 40.9796 (positive) xa1[T.9]:xo7 45.1725 (positive) xa1[T.10]:xo7 32.9818 (positive) xa1[T.11]:xo7 42.7014 (positive) xa2[T.1]:xo7 35.5831 (positive) xa2[T.4]:xo7 33.7546 (positive) xa2[T.6]:xo7 39.0314 (positive) xa2[T.7]:xo7 40.9796 (positive) xa2[T.9]:xo7 45.1725 (positive) xa2[T.10]:xo7 32.9818 (positive) xa2[T.11]:xo7 42.7014 (positive) xa1[T.9]:xo8 -16.2694 (negative) xa2[T.9]:xo8 -16.2694 (negative) xo9 43.7944 (positive) xa1[T.1]:xo9 -29.8904 (negative) xa1[T.4]:xo9 -34.7551 (negative) xa1[T.6]:xo9 -32.5316 (negative) xa1[T.7]:xo9 -30.5116 (negative) xa1[T.9]:xo9 -39.1292 (negative) xa1[T.10]:xo9 -30.5285 (negative) xa1[T.11]:xo9 -39.2545 (negative) xa2[T.1]:xo9 -29.8904 (negative) xa2[T.4]:xo9 -34.7551 (negative) xa2[T.6]:xo9 -32.5316 (negative) xa2[T.7]:xo9 -30.5116 (negative) xa2[T.9]:xo9 -39.1292 (negative) xa2[T.10]:xo9 -30.5285 (negative) xa2[T.11]:xo9 -39.2545 (negative) xa1[T.8]:xo10 -20.3232 (negative) xa2[T.8]:xo10 -20.3232 (negative) xa3[T.rap]:xo10 21.7511 (positive) xa1[T.1]:np.power(xo2,2) -28.6568 (negative) xa2[T.1]:np.power(xo2,2) -28.6568 (negative) xa3[T.latin]:np.power(xo2,2) -30.5226 (negative) xa3[T.pop]:np.power(xo2,2) 87.2739 (positive) xa1[T.8]:np.power(xo3,2) 29.3789 (positive) xa1[T.10]:np.power(xo3,2) 36.7507 (positive) xa1[T.11]:np.power(xo3,2) -13.0223 (negative) xa2[T.8]:np.power(xo3,2) 29.3789 (positive) xa2[T.10]:np.power(xo3,2) 36.7507 (positive) xa2[T.11]:np.power(xo3,2) -13.0223 (negative) xa3[T.rock]:np.power(xo3,2) -12.5841 (negative) xa1[T.9]:np.power(xo4,2) 23.8945 (positive) xa2[T.9]:np.power(xo4,2) 23.8945 (positive) xa1[T.1]:np.power(xo5,2) -70.3206 (negative) xa1[T.2]:np.power(xo5,2) -51.8574 (negative) xa1[T.3]:np.power(xo5,2) -61.8595 (negative) xa1[T.5]:np.power(xo5,2) -78.536 (negative) xa1[T.6]:np.power(xo5,2) -68.3001 (negative) xa1[T.7]:np.power(xo5,2) -72.8173 (negative) xa1[T.8]:np.power(xo5,2) -76.5608 (negative) xa1[T.9]:np.power(xo5,2) -101.9366 (negative) xa1[T.10]:np.power(xo5,2) -77.4584 (negative) xa2[T.1]:np.power(xo5,2) -70.3206 (negative) xa2[T.2]:np.power(xo5,2) -51.8574 (negative) xa2[T.3]:np.power(xo5,2) -61.8595 (negative) xa2[T.5]:np.power(xo5,2) -78.536 (negative) xa2[T.6]:np.power(xo5,2) -68.3001 (negative) xa2[T.7]:np.power(xo5,2) -72.8173 (negative) xa2[T.8]:np.power(xo5,2) -76.5608 (negative) xa2[T.9]:np.power(xo5,2) -101.9366 (negative) xa2[T.10]:np.power(xo5,2) -77.4584 (negative) xa1[T.2]:np.power(xo6,2) 16.9177 (positive) xa1[T.3]:np.power(xo6,2) 1.1312 (positive) xa1[T.9]:np.power(xo6,2) -83.9641 (negative) xa1[T.10]:np.power(xo6,2) 17.0652 (positive) xa2[T.2]:np.power(xo6,2) 16.9177 (positive) xa2[T.3]:np.power(xo6,2) 1.1312 (positive) xa2[T.9]:np.power(xo6,2) -83.9641 (negative) xa2[T.10]:np.power(xo6,2) 17.0652 (positive) xa3[T.latin]:np.power(xo6,2) 197.8834 (positive) xa3[T.rock]:np.power(xo6,2) -44.5284 (negative) xa1[T.6]:np.power(xo7,2) 12.5521 (positive) xa2[T.6]:np.power(xo7,2) 12.5521 (positive) xa3[T.rock]:np.power(xo7,2) 17.3608 (positive) xa1[T.3]:np.power(xo8,2) -16.3328 (negative) xa2[T.3]:np.power(xo8,2) -16.3328 (negative) xa3[T.pop]:np.power(xo8,2) 25.7241 (positive) xa3[T.r&b]:np.power(xo8,2) -13.2994 (negative) xa3[T.rock]:np.power(xo10,2) 18.5038 (positive)
iv: xa3[T.r&b]:xo6, xa3[T.latin]:np.power(xo6,2)
Show coefficient with statistical significant¶
def my_coefplot(mod, figsize_use=(10, 10)):
fig, ax=plt.subplots(figsize = figsize_use)
ax.errorbar(y = mod.params.index,
x = mod.params,
xerr = 2 * mod.bse,
fmt='o', color='k', ecolor='k', elinewidth=2, ms=10)
ax.axvline(x=0, linestyle='--', linewidth=3.5, color='grey')
ax.set_xlabel('coefficient value')
plt.show()
my_coefplot(fit_00)
my_coefplot(fit_01)
my_coefplot(fit_02)
my_coefplot(fit_03)
my_coefplot(fit_04)
my_coefplot(fit_05)
my_coefplot(fit_06)
my_coefplot(fit_07)
For each model that you fit you must show the performance on the training set.¶
1.For each model show the predicted vs observed figure for the training set, and the R-squared and RMSE on the training set.
def my_pred_obs_plot(mod, figsize_use=(10,4)):
df5_y = df5_new.loc[:,['y']].copy()
df5_y['fitted'] = mod.fittedvalues
fig, ax=plt.subplots(figsize = figsize_use)
sns.scatterplot(data=df5_y, x='y', y='fitted', s=50, ax=ax)
plt.show()
df5_y = df5_new.loc[:,['y']].copy()
my_pred_obs_plot(fit_00)
my_pred_obs_plot(fit_01)
my_pred_obs_plot(fit_02)
my_pred_obs_plot(fit_03)
my_pred_obs_plot(fit_04)
my_pred_obs_plot(fit_05)
my_pred_obs_plot(fit_06)
my_pred_obs_plot(fit_07)
Performance metric¶
def fit_and_assess_ols(mod_name, a_formula, the_data):
a_mod = smf.ols(formula=a_formula, data=the_data).fit()
res_dict = {'model_name':mod_name,
'model_formula':a_formula,
'num_coefs':len(a_mod.params),
'R-squared':a_mod.rsquared,
'RMSE':np.sqrt((a_mod.resid ** 2).mean())}
return pd.DataFrame(res_dict, index=[0])
fit_and_assess_ols(0,formula_list[0], df5_new)
| model_name | model_formula | num_coefs | R-squared | RMSE | |
|---|---|---|---|---|---|
| 0 | 0 | y ~ 1 | 1 | 0.0 | 2.519164 |
#apply to all formulas:
ols_results_list = []
for m in range(len(formula_list)):
ols_results_list.append(fit_and_assess_ols(m, formula_list[m], df5_new))
#combine all model results together
ols_results_df = pd.concat(ols_results_list, ignore_index=True)
ols_results_df
| model_name | model_formula | num_coefs | R-squared | RMSE | |
|---|---|---|---|---|---|
| 0 | 0 | y ~ 1 | 1 | 0.000000 | 2.519164 |
| 1 | 1 | y ~ xa1 + xa2 + xa3 | 28 | 0.063357 | 2.438055 |
| 2 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 | 0.095695 | 2.395598 |
| 3 | 3 | y ~ xa1 + xa2 + xa3 + xo1 + xo2 + xo3 + xo4 + ... | 38 | 0.135581 | 2.342171 |
| 4 | 4 | y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 +... | 56 | 0.216473 | 2.229890 |
| 5 | 5 | y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 +... | 308 | 0.538940 | 1.710547 |
| 6 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 | 0.150192 | 2.322292 |
| 7 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 | 0.960028 | 0.503654 |
ols_results_df.sort_values(by=['R-squared'],ascending=False)
| model_name | model_formula | num_coefs | R-squared | RMSE | |
|---|---|---|---|---|---|
| 7 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 | 0.960028 | 0.503654 |
| 5 | 5 | y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 +... | 308 | 0.538940 | 1.710547 |
| 4 | 4 | y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 +... | 56 | 0.216473 | 2.229890 |
| 6 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 | 0.150192 | 2.322292 |
| 3 | 3 | y ~ xa1 + xa2 + xa3 + xo1 + xo2 + xo3 + xo4 + ... | 38 | 0.135581 | 2.342171 |
| 2 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 | 0.095695 | 2.395598 |
| 1 | 1 | y ~ xa1 + xa2 + xa3 | 28 | 0.063357 | 2.438055 |
| 0 | 0 | y ~ 1 | 1 | 0.000000 | 2.519164 |
ols_results_df.sort_values(by=['RMSE'],ascending=True)
| model_name | model_formula | num_coefs | R-squared | RMSE | |
|---|---|---|---|---|---|
| 7 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 | 0.960028 | 0.503654 |
| 5 | 5 | y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 +... | 308 | 0.538940 | 1.710547 |
| 4 | 4 | y ~ (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 +... | 56 | 0.216473 | 2.229890 |
| 6 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 | 0.150192 | 2.322292 |
| 3 | 3 | y ~ xa1 + xa2 + xa3 + xo1 + xo2 + xo3 + xo4 + ... | 38 | 0.135581 | 2.342171 |
| 2 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 | 0.095695 | 2.395598 |
| 1 | 1 | y ~ xa1 + xa2 + xa3 | 28 | 0.063357 | 2.438055 |
| 0 | 0 | y ~ 1 | 1 | 0.000000 | 2.519164 |
2.Which model has the best performance on the training set? Is the best model according to R-squared the SAME as the best model according to RMSE? Is the best model better than the INTERCEPT-ONLY model? How many coefficients are associated with the BEST model?
model 7 is the best.
Yes, the best model according to R-squared are the SAME as the best model according to RMSE.
Yes, the best model better than the INTERCEPT-ONLY model.
588 coefficients are associated with the best model.
E. Models: Predicitions¶
# the model with ALL inputs and linear additive features.
formula_list[3]
'y ~ xa1 + xa2 + xa3 + xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10'
# the best model on the training set
formula_list[7]
'y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + xo8 + xo9 + xo10 + np.power(xo1,2) + np.power(xo2,2) + np.power(xo3,2) + np.power(xo4,2) + np.power(xo5,2) + np.power(xo6,2) + np.power(xo7,2) + np.power(xo8,2) + np.power(xo9,2) + np.power(xo10,2))'
grid 1¶
You must identify the continuous input that you feel is the MOST important based on the statistically significant coefficients in your models.
Judge which of xo2 and xo3 is the most important input.
fit_xo2_xo3 = smf.ols(formula ='y ~ xo2 + xo3', data=df5_new).fit()
my_coefplot(fit_xo2_xo3)
fit_xo2_xo3.conf_int()
| 0 | 1 | |
|---|---|---|
| Intercept | -1.642103 | -1.097224 |
| xo2 | -0.859684 | -0.043477 |
| xo3 | 0.077954 | 0.894160 |
Therefore, xo3 as the most import continuous variable.
1.The MOST important input must have 101 unique values between the minimum and maximum training set values.
2.ALL other inputs must be set to CONSTANT values. Continuous inputs should use a CENTRAL value like the MEAN or MEDIAN. Categorical inputs should use the MOST frequent category.
input_grid = pd.DataFrame({'xo3' : np.linspace(df5_new.xo3.min(), df5_new.xo3.max(), num=101)})
input_grid['xo1'] = df5_new.xo1.mean()
input_grid['xo2'] = df5_new.xo2.mean()
input_grid['xo4'] = df5_new.xo4.mean()
input_grid['xo5'] = df5_new.xo5.mean()
input_grid['xo6'] = df5_new.xo6.mean()
input_grid['xo7'] = df5_new.xo7.mean()
input_grid['xo8'] = df5_new.xo8.mean()
input_grid['xo9'] = df5_new.xo9.mean()
input_grid['xo10'] = df5_new.xo10.mean()
input_grid['xa3'] = df5_new.xa3.value_counts().idxmax()
input_grid['xa1'] = df5_new.xa1.value_counts().idxmax()
input_grid['xa2'] = df5_new.xa2.value_counts().idxmax()
input_grid
| xo3 | xo1 | xo2 | xo4 | xo5 | xo6 | xo7 | xo8 | xo9 | xo10 | xa3 | xa1 | xa2 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -6.037331 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 1 | -5.959835 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 2 | -5.882340 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 3 | -5.804844 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 4 | -5.727349 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 96 | 1.402232 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 97 | 1.479728 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 98 | 1.557223 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 99 | 1.634719 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
| 100 | 1.712214 | -5.415722e-18 | -9.883693e-17 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | rap | 1 | 1 |
101 rows × 13 columns
input_grid.nunique()
xo3 101 xo1 1 xo2 1 xo4 1 xo5 1 xo6 1 xo7 1 xo8 1 xo9 1 xo10 1 xa3 1 xa1 1 xa2 1 dtype: int64
Make predictions with BOTH models on the visualization grid. You MUST visualize the AVERAGE OUTPUT as a line, the CONFIDENCE INTERVAL as a grey ribbon, and the PREDICTION INTERVAL as an orange ribbon with respect to the most important input.
pred_summary_03 = fit_03.get_prediction(input_grid).summary_frame()
pred_summary_07 = fit_07.get_prediction(input_grid).summary_frame()
pred_summary_03.head()
| mean | mean_se | mean_ci_lower | mean_ci_upper | obs_ci_lower | obs_ci_upper | |
|---|---|---|---|---|---|---|
| 0 | -5.695331 | 1.601711 | -8.847301 | -2.543361 | -11.447229 | 0.056568 |
| 1 | -5.646030 | 1.584772 | -8.764665 | -2.527395 | -11.379729 | 0.087669 |
| 2 | -5.596729 | 1.567854 | -8.682071 | -2.511387 | -11.312388 | 0.118930 |
| 3 | -5.547428 | 1.550957 | -8.599520 | -2.495336 | -11.245207 | 0.150351 |
| 4 | -5.498127 | 1.534083 | -8.517013 | -2.479241 | -11.178188 | 0.181934 |
pred_summary_07.head()
| mean | mean_se | mean_ci_lower | mean_ci_upper | obs_ci_lower | obs_ci_upper | |
|---|---|---|---|---|---|---|
| 0 | 261.858156 | 490.585850 | -758.370971 | 1282.087282 | -758.379369 | 1282.095680 |
| 1 | 256.809651 | 478.723186 | -738.749715 | 1252.369017 | -738.758321 | 1252.377622 |
| 2 | 251.815892 | 467.009312 | -719.383140 | 1223.014924 | -719.391961 | 1223.023745 |
| 3 | 246.876879 | 455.444284 | -700.271360 | 1194.025119 | -700.280405 | 1194.034164 |
| 4 | 241.992613 | 444.028161 | -681.414498 | 1165.399724 | -681.423776 | 1165.409002 |
fig, ax = plt.subplots()
# prediction interval
ax.fill_between(input_grid.xo3,
pred_summary_03.obs_ci_lower,
pred_summary_03.obs_ci_upper,
facecolor='orange', alpha=0.75, edgecolor='orange')
#confidence interval
ax.fill_between(input_grid.xo3,
pred_summary_03.mean_ci_lower,
pred_summary_03.mean_ci_upper,
facecolor='grey',edgecolor='grey')
#trend
ax.plot(input_grid.xo3, pred_summary_03['mean'], color='k', linewidth=1)
#set labels
ax.set_xlabel('xo3')
ax.set_ylabel('y')
plt.show()
fig, ax = plt.subplots()
# prediction interval
ax.fill_between(input_grid.xo3,
pred_summary_07.obs_ci_lower,
pred_summary_07.obs_ci_upper,
facecolor='orange', alpha=0.75, edgecolor='orange')
#confidence interval
ax.fill_between(input_grid.xo3,
pred_summary_07.mean_ci_lower,
pred_summary_07.mean_ci_upper,
facecolor='grey',edgecolor='grey')
#trend
ax.plot(input_grid.xo3, pred_summary_07['mean'], color='k', linewidth=1)
#set labels
ax.set_xlabel('xo3')
ax.set_ylabel('y')
plt.show()
Comment: trends are different between the two models based on the same input xo3. xo3 in model_03 is an positive coef while slightly negative in model_07. The predictions and confidence intervals are both wider in model_07, indicating more uncertainty in the estimated mean response.
grid 2¶
Choose xo2, xo3, xa3 as inputs.
input_grid2 = pd.DataFrame([(xo3n, xo2n) for xo3n in np.linspace(df5_new.xo3.min(), df5_new.xo3.max(), num=101)
for xo2n in np.linspace(df5_new.xo2.min(), df5_new.xo2.max(), num=5)],
columns=['xo3n','xo2n'])
xa3n = df5_new.xa3.unique()
viz_grid2 = pd.DataFrame({'xo3' : np.repeat(input_grid2.xo3n,len(xa3n)),
'xo2' : np.repeat(input_grid2.xo2n,len(xa3n)),
'xa3' : np.tile(xa3n,len(input_grid2.xo3n))})
viz_grid2.nunique()
xo3 101 xo2 5 xa3 6 dtype: int64
viz_grid2['xo1'] = df5_new.xo1.mean()
viz_grid2['xo4'] = df5_new.xo4.mean()
viz_grid2['xo5'] = df5_new.xo5.mean()
viz_grid2['xo6'] = df5_new.xo6.mean()
viz_grid2['xo7'] = df5_new.xo7.mean()
viz_grid2['xo8'] = df5_new.xo8.mean()
viz_grid2['xo9'] = df5_new.xo9.mean()
viz_grid2['xo10'] = df5_new.xo10.mean()
viz_grid2['xa1'] = df5_new.xa1.value_counts().idxmax()
viz_grid2['xa2'] = df5_new.xa2.value_counts().idxmax()
viz_grid2
| xo3 | xo2 | xa3 | xo1 | xo4 | xo5 | xo6 | xo7 | xo8 | xo9 | xo10 | xa1 | xa2 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -6.037331 | -3.645383 | r&b | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 0 | -6.037331 | -3.645383 | rock | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 0 | -6.037331 | -3.645383 | rap | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 0 | -6.037331 | -3.645383 | latin | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 0 | -6.037331 | -3.645383 | edm | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 504 | 1.712214 | 1.608127 | rock | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 504 | 1.712214 | 1.608127 | rap | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 504 | 1.712214 | 1.608127 | latin | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 504 | 1.712214 | 1.608127 | edm | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
| 504 | 1.712214 | 1.608127 | pop | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 |
3030 rows × 13 columns
viz_grid2.nunique()
xo3 101 xo2 5 xa3 6 xo1 1 xo4 1 xo5 1 xo6 1 xo7 1 xo8 1 xo9 1 xo10 1 xa1 1 xa2 1 dtype: int64
viz_grid2['pred03'] = fit_03.predict(viz_grid2)
viz_grid2['pred07'] = fit_07.predict(viz_grid2)
viz_grid2
| xo3 | xo2 | xa3 | xo1 | xo4 | xo5 | xo6 | xo7 | xo8 | xo9 | xo10 | xa1 | xa2 | pred03 | pred07 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -6.037331 | -3.645383 | r&b | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -2.991439 | 92.668548 |
| 0 | -6.037331 | -3.645383 | rock | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -2.181187 | -1066.974591 |
| 0 | -6.037331 | -3.645383 | rap | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -3.156076 | 89.381058 |
| 0 | -6.037331 | -3.645383 | latin | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -3.158265 | -298.129955 |
| 0 | -6.037331 | -3.645383 | edm | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -0.982070 | -418.706528 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 504 | 1.712214 | 1.608127 | rock | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -0.910516 | -87.634972 |
| 504 | 1.712214 | 1.608127 | rap | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -1.885405 | -0.145164 |
| 504 | 1.712214 | 1.608127 | latin | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -1.887593 | -33.475674 |
| 504 | 1.712214 | 1.608127 | edm | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | 0.288602 | -45.670132 |
| 504 | 1.712214 | 1.608127 | pop | -5.415722e-18 | -1.245616e-16 | 7.582011e-17 | 0.0 | -1.326852e-16 | 2.707861e-16 | 3.682691e-16 | 5.551115e-17 | 1 | 1 | -0.204275 | -135.013879 |
3030 rows × 15 columns
sns.relplot(data=viz_grid2,
x='xo3', y = 'pred03', kind='line',
hue='xo2',palette='coolwarm',col='xa3', col_wrap=3,
estimator=None, units='xo2')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
sns.relplot(data=viz_grid2,
x='xo3', y = 'pred07', kind='line',
hue='xo2',palette='coolwarm',col='xa3', col_wrap=3,
estimator=None, units='xo2')
plt.show()
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight self._figure.tight_layout(*args, **kwargs)
Trends: In Model 03, predictions have more solid trends on variable xo3, xo3 and xo2 are linear additive features. Additionally, different categories of xa3 has different effect on the trends. In model 07, the trends are not obvious on xo3, xo2 and xa3.
Uncertainty: I think model 07 has higher prediction uncertainty than model 03.
F. Model performance and validation¶
Select model 7 since the formulation that was the best model on the training set.
Choose model 2 and model 6 as 2 additional formulations. model 2 is simple (few features), and model 6 is of medium to high complexity .
df5_new
| xo1 | xo2 | xo3 | xo4 | xo5 | xo6 | xo7 | xo8 | xo9 | xo10 | xa1 | xa2 | xa3 | y | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.228518 | -1.013955 | -0.767625 | -0.943733 | 1.303515 | -0.210842 | 2.187045 | -0.936229 | 0.028455 | 0.059377 | 9 | 9 | r&b | -0.200671 |
| 1 | -0.453146 | 0.101941 | 0.358667 | -0.675642 | -0.611601 | -0.210842 | -0.679626 | -2.274030 | -0.933172 | 1.500060 | 1 | 1 | rock | 0.281851 |
| 2 | -0.703672 | -0.244371 | -1.458995 | 0.768784 | -0.788545 | 0.190331 | -0.309733 | -0.338908 | 1.453963 | -0.317121 | 4 | 4 | rock | 0.200671 |
| 3 | -0.703672 | -0.244371 | -1.458995 | 0.768784 | -0.788545 | 0.190331 | -0.309733 | -0.338908 | 1.453963 | -0.317121 | 4 | 4 | rock | 0.200671 |
| 4 | -0.849326 | 0.827549 | 0.563381 | 1.533530 | 0.529784 | -0.210842 | -0.053654 | 0.821178 | 0.976431 | 0.724371 | 4 | 4 | rap | -0.847298 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 323 | -1.519337 | 1.234329 | 0.673867 | -0.704857 | -0.812472 | -0.180955 | -0.466227 | -0.723958 | -0.779263 | -0.416850 | 9 | 9 | rock | 0.800119 |
| 324 | 1.114098 | 0.266852 | -0.188220 | -0.159224 | -0.694844 | 4.214700 | -0.082107 | -1.642153 | -0.719894 | -0.723362 | 2 | 2 | rap | -0.160343 |
| 325 | -0.820195 | 0.613165 | 0.877472 | -0.709154 | -0.503279 | -0.210842 | -0.437773 | 0.776749 | -0.974514 | -0.612667 | 2 | 2 | pop | 0.160343 |
| 326 | -2.218480 | -0.189401 | -0.296489 | -0.878429 | -0.798043 | -0.170035 | -0.614895 | -1.686582 | 0.251595 | 0.692511 | 11 | 11 | rock | -3.891820 |
| 327 | 0.717917 | -1.200854 | -0.179352 | -0.270928 | 0.860621 | -0.210842 | -0.395093 | 0.411445 | 0.094812 | -0.328870 | 1 | 1 | r&b | 0.663294 |
328 rows × 14 columns
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from patsy import dmatrices
Choose 10-fold using regular k-fold.
Use R-squared and RMSE as performance metric.
kf = KFold(n_splits=10, shuffle=True, random_state=101)
kf.get_n_splits()
10
sk_lm = LinearRegression(fit_intercept=False)
def lm_cross_val_score(mod_name, a_formula, init_mod, the_data, cv):
#create features and output arrays
y, X = dmatrices(a_formula, data=the_data)
#train and test within each fold: return thest set scores
##rsquared
test_r2 = cross_val_score(init_mod, X, y.ravel(), cv=cv)
##rmse
test_rmse = -cross_val_score(init_mod, X, y.ravel(), cv=cv, scoring = 'neg_root_mean_squared_error')
#book keeping
res_df = pd.DataFrame({'R-squared': test_r2,
'RMSE': test_rmse})
res_df['fold_id'] = res_df.index + 1
res_df['model_name'] = mod_name
res_df['model_formula'] = a_formula
res_df['num_coefs'] = X.shape[1]
return res_df
cv_score_list = []
cv_score_list.append(lm_cross_val_score(2,formula_list[2],init_mod=sk_lm, the_data=df5_new, cv=kf))
cv_score_list.append(lm_cross_val_score(6,formula_list[6],init_mod=sk_lm, the_data=df5_new, cv=kf))
cv_score_list.append(lm_cross_val_score(7,formula_list[7],init_mod=sk_lm, the_data=df5_new, cv=kf))
cv_score_df = pd.concat(cv_score_list, ignore_index=True)
cv_score_df
| R-squared | RMSE | fold_id | model_name | model_formula | num_coefs | |
|---|---|---|---|---|---|---|
| 0 | -1.620704e-02 | 2.304654e+00 | 1 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 1 | 9.741837e-02 | 2.530741e+00 | 2 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 2 | 1.003722e-01 | 2.482046e+00 | 3 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 3 | 1.243057e-01 | 2.123039e+00 | 4 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 4 | 8.218282e-03 | 2.170871e+00 | 5 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 5 | -2.219856e-01 | 2.034001e+00 | 6 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 6 | 3.974326e-03 | 2.532431e+00 | 7 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 7 | -7.919117e-02 | 2.893148e+00 | 8 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 8 | 1.491614e-02 | 2.692573e+00 | 9 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 9 | -1.139074e-01 | 2.966316e+00 | 10 | 2 | y ~ xo1 + xo2 + xo3 + xo4 + xo5 + xo6 + xo7 + ... | 11 |
| 10 | -1.106629e-01 | 2.409383e+00 | 1 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 11 | -2.021951e-01 | 2.920734e+00 | 2 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 12 | -6.416160e-02 | 2.699493e+00 | 3 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 13 | -1.828822e-04 | 2.268931e+00 | 4 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 14 | -2.296802e-01 | 2.417252e+00 | 5 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 15 | -4.449769e-01 | 2.211813e+00 | 6 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 16 | -5.815719e-03 | 2.544846e+00 | 7 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 17 | -4.884900e-02 | 2.852186e+00 | 8 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 18 | -1.005767e-01 | 2.846040e+00 | 9 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 19 | -5.020146e-01 | 3.444530e+00 | 10 | 6 | y ~ (xa1 + xa2 + xa3) + (xo1 + xo2 + xo3 + xo4... | 48 |
| 20 | -8.971546e+02 | 6.851571e+01 | 1 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 21 | -6.585992e+02 | 6.841393e+01 | 2 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 22 | -1.063349e+02 | 2.711122e+01 | 3 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 23 | -3.769258e+19 | 1.392867e+10 | 4 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 24 | -3.041955e+02 | 3.808158e+01 | 5 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 25 | -1.789580e+03 | 7.786010e+01 | 6 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 26 | -1.046066e+03 | 8.210874e+01 | 7 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 27 | -1.711361e+02 | 3.653907e+01 | 8 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 28 | -5.008654e+03 | 1.920148e+02 | 9 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
| 29 | -7.855123e+02 | 7.882167e+01 | 10 | 7 | y ~ (xa1 + xa2 + xa3) * (xo1 + xo2 + xo3 + xo4... | 588 |
Visualize the CROSS-VALIDATION results by showing the AVERAGE CROSS-VALIDATION performance metric with the 95% confidence interval for each model.
sns.catplot(data=cv_score_df, x='model_name', y='R-squared', kind='point', join=False, errorbar=('ci',95))
plt.show()
C:\Users\russell\AppData\Local\Temp\ipykernel_16296\3196590551.py:1: UserWarning:
The `join` parameter is deprecated and will be removed in v0.15.0. You can remove the line between points with `linestyle='none'`.
sns.catplot(data=cv_score_df, x='model_name', y='R-squared', kind='point', join=False, errorbar=('ci',95))
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=cv_score_df, x='model_name', y='RMSE', kind='point', join=False, errorbar=('ci',95))
plt.show()
C:\Users\russell\AppData\Local\Temp\ipykernel_16296\1021815062.py:1: UserWarning:
The `join` parameter is deprecated and will be removed in v0.15.0. You can remove the line between points with `linestyle='none'`.
sns.catplot(data=cv_score_df, x='model_name', y='RMSE', kind='point', join=False, errorbar=('ci',95))
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='R-squared', kind='point', join=False, errorbar=('ci',95))
plt.show()
C:\Users\russell\AppData\Local\Temp\ipykernel_16296\2443661836.py:1: UserWarning:
The `join` parameter is deprecated and will be removed in v0.15.0. You can remove the line between points with `linestyle='none'`.
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='R-squared', kind='point', join=False, errorbar=('ci',95))
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='RMSE', kind='point', join=False, errorbar=('ci',95))
plt.show()
C:\Users\russell\AppData\Local\Temp\ipykernel_16296\3448828944.py:1: UserWarning:
The `join` parameter is deprecated and will be removed in v0.15.0. You can remove the line between points with `linestyle='none'`.
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='RMSE', kind='point', join=False, errorbar=('ci',95))
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='R-squared', kind='point', join=False, errorbar=('ci',68))
plt.show()
C:\Users\russell\AppData\Local\Temp\ipykernel_16296\1646769289.py:1: UserWarning:
The `join` parameter is deprecated and will be removed in v0.15.0. You can remove the line between points with `linestyle='none'`.
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='R-squared', kind='point', join=False, errorbar=('ci',68))
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='RMSE', kind='point', join=False, errorbar=('ci',68))
plt.show()
C:\Users\russell\AppData\Local\Temp\ipykernel_16296\1947586138.py:1: UserWarning:
The `join` parameter is deprecated and will be removed in v0.15.0. You can remove the line between points with `linestyle='none'`.
sns.catplot(data=cv_score_df.loc[cv_score_df.model_name != 7,:], x='model_name', y='RMSE', kind='point', join=False, errorbar=('ci',68))
C:\Users\russell\anaconda3\envs\cmpinf2100\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
self._figure.tight_layout(*args, **kwargs)
len(fit_02.params)
11
fit_02.pvalues < 0.05
Intercept True xo1 False xo2 True xo3 True xo4 True xo5 False xo6 False xo7 False xo8 False xo9 True xo10 True dtype: bool
Model 2 is the BEST according to CROSS-VALIDATION, which is DIFFERENT from the model 7 identified as the BEST according to the training set.
11 regression coefiicients are assocaited with the best model, 6 of them (including intercept) are statistically significant.
The practice of cross-validation and also supplemental predictions away from the rest of the model lists could be found in the supplemental file.